Volume 160 
Number 3 



January 29, 2015



www.cell.com















Leading Edge 

In This Issue 



Tumor Sends Kinase on Food Run 

PAGE 393 

The tumor microenvironment is a rich source of metabolic substrates that can 
be utilized by cancer cells. Loo et al. find that colon cancer cells secrete a kinase 
that acts extracellularly to generate one such metabolite, phosphocreatine,
that directly fuels tumor growth and metastasis to the liver. Inhibiting this kinase 
with small molecules or adenoviral vectors suppresses metastatic colonization 
in mice and may prove beneficial in humans. 

Transcription: The Director’s Cut 

PAGE 367 

Puc et al. report that Topoisomerase 1 nicking function is required for robust 
enhancer activation. DNA nicking is necessary to relieve torsional stress and 
promote eRNA synthesis, revealing an unexpected connection between tran- 
scription and the DNA damage repair response. 

Signals Entrained Above (and with) the Noise 

PAGE 381 

Noisiness in molecular systems has been considered detrimental for signal transduction. Now, however, Kellogg et al. show 
that noise can work in concert with signal oscillations to control the transcriptional activity of NF-κB. Intrinsic biochemical
noise in individual cells promotes oscillation of NF-κB entrained to fluctuating cytokine signals, and cell-to-cell variability in
NF-κB levels creates population robustness. Together, the two types of transcriptional noise enable signal entrainment
over a wider range of dynamic inputs. 

Argonaute’s Partner in Silencing 

PAGE 407 

Argonaute proteins typically cleave mRNA targets during RNAi, but their slicing activity is not required for silencing in 
C. elegans. Tsai et al. identify a ribonuclease that is recruited by Argonaute and promotes cleavage and uridylation of 
RNAi targets, which are then used as templates for siRNA amplification. The findings suggest how Argonautes promote 
both mRNA silencing and amplification of the silencing signal. 

Latency in Quiet Places 

PAGE 420 

It has been recently proposed that latent HIV often integrates in or near cell-cycle or cancer-related genes, leading to clonal
expansion and persistence of CD4+ T cells. Now Cohn et al. show that clonally expanded T cells chiefly contain defective pro-
viruses and thus do not form a biologically active latent reservoir. Instead, the reservoir resides in quiescent cells containing
single integrations within silent genes or intergenic regions. 

Bespoke Antibodies for HIV 

PAGE 433 

While antibodies generally neutralize viruses by bivalent binding to neighboring 
virion spikes, HIV-1 spike architecture prohibits intra-spike crosslinking by
naturally occurring antibodies. Galimidi et al. develop a class of engineered
antibody-based molecules, rationally designed for high avidity intra-spike bind-
ing, that overcomes this barrier and shows a more than 100-fold increase in HIV-1
neutralization potency. 



The Inflammatory Virome 

PAGE 447 

The interplay between bacterial diversity and composition and enteric disease 
is increasingly understood, yet little is known about the effects of the virome. 
Norman et al. find that the enteric virome is perturbed in inflammatory bowel 
disease patients and uncover specific features of the viromes in Crohn’s dis- 
ease and ulcerative colitis that are not explained by changes in bacterial diver- 
sity and richness. 

Healthy Competition between Cells

PAGE 461 

Damaged cells can accumulate during development and aging. Merino et al. 
now identify a new Drosophila gene, azot, as the key factor ensuring the elim- 
ination of these less fit cells. These findings enable the authors to demonstrate 
that direct cell-to-cell fitness comparison and selection is essential for maintain- 
ing organismal health and extending lifespan. 

Mice Not Missing MYC (at all!) 

PAGE 477 

Hofmann et al. report that reduced expression of MYC increases lifespan in 
mice and makes their aging tissues healthier too. These benefits occur without 
apparent developmental trade-offs or significant changes in stress manage- 
ment pathways. Rather, they appear to be mediated by a unique combination 
of changes in core nutrient and energy sensing pathways. 



Kinase-cum-Tumor Suppressor 

PAGE 489 

Protein kinase C (PKC) inhibitors have been unsuccessful in clinical trials. Bioinformatic, genetic, and biochemical analyses of 
human cancer-associated PKC mutations by Antal et al. reveal that the majority are loss-of-function, uncovering an unex-
pected role for PKC as a tumor suppressor. Since reduction of PKC function enhances tumor growth, therapeutic strategies to
restore rather than inhibit PKC activity should be pursued. 



Light Touch and Go 

PAGE 503 

Different kinds of sensory touch input (stretch, vibration, heat) are processed by different sets of neurons. Bourane et al. iden- 
tify a class of spinal cord interneurons essential for transmitting light touch input from receptors in the skin. These neurons 
additionally integrate sensory touch information with motor control input from the brain to generate the corrective motor 
movements needed to balance when walking on uneven surfaces. 



Seek, Consume, or Binge 

PAGE 516 and PAGE 528 

Animals need to seek out and then consume food. Jennings et al. define the subset of neurons in the lateral hypothalamus 
responsible for encoding the two behaviors. Combining optogenetics and calcium imaging to measure neuronal activity at 
single-cell resolution in freely behaving mice, they find that seeking and consuming behaviors are encoded by distinct neu- 
rons, suggesting that the two behavioral aspects can be dissociated. In a related paper, Nieh et al. describe the neural circuit 
loop that selectively controls compulsive sugar consumption but does not affect feeding necessary for survival, providing a 
potential target for compulsive overeating therapeutic interventions. 



Enhancers on Evolution’s Fast Track 

PAGE 554 

A comparative functional genomic analysis in liver samples from 20 mammalian 
species enables Villar et al. to show that, in contrast to promoters, enhancers
evolve rapidly and that recently evolved enhancers dominate the mammalian 
regulatory landscape. Interestingly, most of these enhancers arise from 
genomic regions present in the mammalian ancestors, rather than lineage-spe- 
cific expansions of repeat elements, and they are associated with genes under 
positive selection. 

Channeling a Transporter 

PAGE 542 

Excitatory amino acid transporters operate both as transporters and as anion- 
selective ion channels at synapses. Machtens et al. employ a combination of 
simulations and experiments with prokaryotic and mammalian glutamate trans- 
porter homologs to define the anion conduction pathway and elucidate how a 
class of active transporters can function as selective, gated anion channels. 

Building a Superhero



Innovation in the natural world has inspired many budding and
established scientists, providing insights into species’ adap- 
tation and the processes that drive it. More recently, and as 
our ability to manipulate materials expands, evolutionary in- 
ventions have also provided creative solutions for human 
challenges. The development of Velcro, designed to the like- 
ness of thistle’s barbed seeds (de Mestral, 1958), is an 
example of an invention where the idiosyncratic problem of 
seed dispersal translated into a universal solution for our 
need to keep things together. 

Nature-inspired designs, or biomimetics, can also be
applied to less obvious problems. One recent example is 
the study from the laboratories of Tae-il Kim and Mansoo 
Choi describing an ultra-sensitive mechanosensor that can 
pick up faint physiological outputs such as blood pressure, 
heartbeat, or voice patterns (Kang et al., 2014). The sensor 
was inspired by the lyriform organ, a specialized exoskeletal 
structure found on the limbs of spiders, consisting of paral- 
lel slits of different lengths connected to the nervous sys- 
tem. This arrangement, reminiscent of the strings in a lyre 
and used for communication, allows spiders to pick up 
exquisitely fine vibrations from their surroundings. Kim 
and Choi mimicked the lyriform organ by layering nanome- 
ters-thick platinum on top of a viscoelastic polymer and 
generating cracks in the platinum and polymer layers. The 
cracks are the sensing organ — electrical conductance 
across them depends on the size of the gap, allowing mea-
surement of the fine vibrations and pressure that distort
the layers. This wearable sensor is indeed able to pick up 
fine distinctions in speech or blood flow, and one can ima- 
gine a wide range of future applications where it could 
prove its worth. 

Arguably, such a “spidey sense,” no matter how sophisti- 
cated, won’t cut it for Spiderman. The ability to climb vertical 
surfaces would be a nice touch, and indeed, researchers 
from Mark Cutkosky’s lab accomplished this by optimizing 
an adhesive system inspired by geckos (Hawkes et al., 
2015). Geckos’ uncanny ability to defy gravity resides in
spatula-shaped lamellae covering their footpads, which 
adhere to surfaces through van der Waals forces. A variety 
of biomimetic materials capturing their properties have 
been developed; however, the remaining challenge was 
scaling the amount of adhesive to human weight for safe 
and uninterrupted climbing. Cutkosky’s lab resolved this 
problem by developing a load-sharing method designed to 
ensure a uniform distribution of forces across the adhesive, 
enabling a human to climb a vertical glass surface with a 
hand-sized area of adhesive. 

Camouflage, or the ability to merge with the surroundings, has
long been on the wish list of the military industry and con- 
sumer product designers. It is in equally high demand in 
nature and is brought to perfection by cephalopods that
can change coloration quickly by altering pigment-containing 
skin chromatophores. The laboratory of John Rogers
recently recreated this process by combining several layers
of synthetic materials and sensors, mimicking the elements 
of the cephalopod skin, and achieving autonomous color 
matching to the background (Yu et al., 2014). Such a system 
not only has diverse applications, but it also deconstructs the 
complexity of a natural organ. 

Ultimately, biomimetic advances like the ones described 
above show that harnessing designs honed by evolutionary 
forces rather than the human creative process may not only 
result in versatile solutions, but also inspire quests for 
enhancing human capabilities. 

Donald Metcalf (1929-2014)



Once more unto the breach, dear 
friends, once more. 

Or close the wall up with our 
English dead! 

In peace there’s nothing so be- 
comes a man 

As modest stillness and humility. 

But when the blast of war blows in 
our ears,

Then imitate the action of the tiger: 
Stiffen the sinews, summon up the 
blood. 

—William Shakespeare 
(King Henry, Henry V) 

Donald Metcalf— Professor Metcalf to 
those younger colleagues who were 
meeting him for the first time and Don to 
everyone else— was one of the fathers of 
modern hematology, shaping the field 
for the 60 years of his working life
(1954-2014) at The Walter and Eliza Hall Institute
of Medical Research. 

Don was born in Mittagong, Australia, 
a little more than 100 km southwest of 
Sydney, on February 26, 1929, just prior 
to the start of the depression. Don’s 
parents were schoolteachers, and his 
childhood was nomadic, moving with 
them from school to school in country 
New South Wales. As one
might expect, this environ- 
ment instilled the self-reliance, 
resilience, work ethic, and 
appreciation for education 
that characterized Don’s 
entire life. Following school,
Don entered Sydney Univer-
sity as a medical student
in 1946 and completed a 
research year in the laboratory 
of Pat De Burgh, working on 
ectromelia virus. That year 
of research whetted Don’s 
appetite, and after graduating 
in 1953, Don was awarded the
Carden Fellowship of the Anti- 
Cancer Council of Victoria 
and moved to Melbourne in 
1954 to establish a cancer 
research laboratory at The 
Walter and Eliza Hall Institute 
of Medical Research, which 
was at that time led by the
future Nobel Laureate Frank McFarlane
Burnet. 

In an age when scientific superstars run 
mega-labs and often move from institu- 
tion to institution chasing ever more lucra- 
tive packages, the opposite was true of 
Don. Throughout his entire career, he 
was supported by the same fellowship 
and worked at the same institution. 
Despite his burgeoning reputation, he 
never left the laboratory bench and con- 
ducted experiments himself with the sup- 
port of one or two research assistants and 
an occasional graduate student. Having 
studied thymus development and lym- 
phomagenesis for a decade, Don’s epiph- 
any came in 1965, when with Ray Bradley 
at The University of Melbourne, he used 
semi-solid culture medium containing a 
crude source of growth factor to grow col- 
onies of granulocytes and macrophages 
from their precursors in the bone marrow. 
Don realized immediately that this tech- 
nique would not only permit him to map 
the hierarchical relationship between 
multipotential stem cells, committed pro- 
genitors, and their differentiating progeny, 
but would also allow him to define the hor- 
mones that regulated the process. Don 
worked on nothing else for the next
50 years and, in doing so, made the blood
cell system the poster child for under- 
standing tissue renewal. 

Don named the hormones that stimu- 
lated the growth of blood cells colony- 
stimulating factors, or CSFs. Unlike 
classic hormones, which are produced 
by a single tissue or cell type and act 
widely, Don found CSFs to be produced 
by almost every tissue he examined, but 
in vanishingly small amounts, and to act 
on a limited set of target cells. This re- 
sulted in a huge degree of skepticism in 
the scientific community; many thought 
that CSFs were in vitro artifacts that 
were of little importance to the regulation 
of blood cell production in vivo. Skepti- 
cism was the spur that Don relished. Re- 
cruiting a series of protein chemists and 
molecular biologists to the collaborative 
cause, including Richard Stanley, Tony 
Burgess, Nicos Nicola, Ashley Dunn, and 
Nick Gough, Don orchestrated a 30-year 
collaboration that saw the purification to 
homogeneity of four colony-stimulating 
factors: granulocyte CSF (G-CSF),
macrophage CSF (M-CSF), granulocyte- 
macrophage CSF (GM-CSF), and multi- 
CSF, which is now better known as inter- 
leukin-3. 

The purification of the CSFs 
led to the cloning of their 
genes— GM-CSF by Don’s 
team in Melbourne and 
G-CSF, M-CSF, and IL-3 by 
others— and ultimately led to 
the mass production of 
CSFs, allowing Don to do the 
experiment about which he 
had dreamed for 30 years: 
injecting pure recombinant 
CSFs into animals and seeing 
whether blood cell produc- 
tion becomes elevated. The 
answer was an emphatic 
yes, and these in vivo experi- 
ments paved the way for the 
successful use of G-CSF and 
GM-CSF in elevating the level 
of white cells in patients un- 
dergoing chemotherapy for 
cancer to reduce the likeli- 
hood of infection due to 
leukopenia. It was during 
these clinical trials that Don and a post-
doctoral fellow, Uli Duhrsen, discovered
that, upon injection of patients with 
G-CSF, stem cells migrated from the 
bone marrow into the peripheral blood. 
This observation rapidly led to the use of 
G-CSF as an agent to mobilize stem cells, 
meaning that they could be obtained by 
simple blood collection rather than the 
more complicated and painful bone 
marrow harvest. This revolutionized bone 
marrow transplantation, making it safer, 
more effective, and more broadly appli- 
cable. It is estimated that more than 
20 million people have benefitted from 
Don’s discoveries, including the Spanish 
tenor Jose Carreras, who was an early 
recipient of the therapy. 

Don’s profound discoveries led to 
countless distinguished lectureships, 
awards, and prizes, including Fellowship 
of the Australian Academy of Science, 
The Royal Society and the US National 
Academy of Sciences, The Bristol-Myers 
Award for Distinguished Achievement in 
Cancer Research, The Robert Koch Prize, 
The Armand Hammer Prize for Cancer 
Research, The General Motors Cancer 
Foundation Sloan Prize, The Lasker-De- 
Bakey Clinical Medical Research Award, 
and a Lifetime Achievement Award from 
the American Association for Cancer 
Research. Although his profound discov- 
eries and many honors epitomize the sci- 
entist, they don’t paint a complete picture 
of the man. 

Don detested spin — he wanted his data 
to speak for itself and was scathing of re-
searchers he termed “prancers” or “strut-
ters,” who were more showmen than sci-
entists. At conferences and in seminars,
he would ask the incisive question or, if 
required, deliver the acerbic assess- 
ment— the more famous the researcher, 
the more pointed and colorful the criticism 
if the talk was not up to par. He was happy 
to send his papers to specialty hematolo- 
gy journals, and unless badgered by 
ambitious junior colleagues, Don es- 
chewed what he called “fancy” journals, 
which he thought required sensationaliza-
tion of the story.

Don was comfortable working in Mel- 
bourne, which for those in Boston, San 
Francisco, London, or Paris seemed like 
the end of the Earth. This tyranny of dis- 
tance was actually a great source of com- 
fort to Don, who could follow his own 
scientific compass without being blown 
off course by trends or fads. Don was 
deeply suspicious of researchers who, 
on achieving a level of success, retreated 
out of the laboratory into the safety of their 
office. He was a man who led from the 
experimental front for his whole life, which 
is why he so liked the “Once more unto 
the breach” speech of Henry V and titled 
his autobiography Summon Up The 
Blood. He worked five and a half days a 
week for most of the last 50 years, with 
his only concession to age being an 
occasional Saturday off and leaving 
work a little early to avoid the traffic in 
the evening. 

Other than a bad back, which was no 
doubt exacerbated by endless hours 
working at the microscope, Don remained 
in remarkable health until the middle of 
last year. Feeling “off color,” Don and 
his beloved wife of 60 years, Josephine,
took a vacation, which he hoped would
reinvigorate him for another decade of 
discovery. Returning in August feeling 
worse, Don suspected the worst and 
was quickly diagnosed with metastatic 
pancreatic cancer. Knowing the chance 
of cure was slim, Don had two concerns: 
to finish off the experiments awaiting his 
return from vacation and to spend as 
much time as possible with his family. 
But how to do both? The answer was to 
have his microscope moved from his lab- 
oratory to his dining room table at home 
and to do experiments when his chemo- 
therapy schedule allowed. Don carried 
out his last experiment in November, 
when his health began to decline precipi- 
tously. He died on December 15, 2014, 
surrounded by his wife Jo and their four 
daughters and their families. Don was an 
experimentalist and a family man to the 
end. He would have wanted it no other 
way. 

The Metabolic Milieu of Metastases

To colonize the liver, colon cancer metastases must overcome hypoxia and other metabolic stress.
Loo et al. now show that metastatic cells achieve this by decreasing miR-483 and miR-551a expres-
sion, which derepresses creatine kinase expression and allows energy to be captured from extra-
cellular ATP through generation and import of phosphocreatine.



It has long been appreciated that cancer 
cells exhibit altered patterns of meta- 
bolism. The century-old observation of 
aerobic glycolysis— or, the Warburg Ef- 
fect— has evolved into a more nuanced 
and context-specific understanding of 
how cancer cells reprogram their meta- 
bolism to meet the biosynthetic demands 
of rapid proliferation and overcome meta- 
bolic stress imposed by the microenviron- 
ment (Hanahan and Weinberg, 2011). 
In this issue of Cell, in an elegant series 
of experiments, Loo et al. (2015) now
demonstrate that downregulation of spe- 
cific microRNAs (miRNAs) allows meta- 
static colon cancer cells to adapt meta- 
bolically to the harsh conditions in the 
liver by using a secreted liver metabolite 
to scavenge energy from the extracellular 
environment. 

Several lines of evidence have sug- 
gested key roles for miRNAs in can- 
cer progression and metabolism. Broad 
dysregulation of miRNA expression ac- 
companies tumorigenesis (Volinia et al., 
2006), while loss of specific miRNAs is 
associated with promotion of metastasis 
both experimentally and clinically (Tava- 
zoie et al., 2008). miRNAs also regulate 
key enzymes important for maintaining 
the cancer metabolic program. For ex- 
ample, MYC enhances cancer glutamine 
metabolism in part by suppressing miR-
23a/b, leading to upregulation of the
target glutaminase (Gao et al., 2009), 
and mTOR activation promotes cancer 
glycolysis in part by suppressing miR-
143 and upregulating its target, hexo-
kinase 2 (Fang et al., 2012). Using two
independent in vivo selection techniques
in mice, Loo et al. demonstrate that
silencing of miR-483 and miR-551a, both
of which target creatine kinase, brain- 
type (CKB), promotes metastasis of colon 
cancer cells to the liver. But how does 
derepression of CKB expression enhance 
metastatic colonization? 

The liver is considerably hypoxic and 
may have uneven levels of glucose in 
the interstitial space due to competition 
from neighboring hepatocytes. Because 
oxygen and glucose are key substrates 
for oxidative and glycolytic metabolism, 
colon cancer cells that metastasize to 
the liver experience metabolic stress. 
To overcome this stress, they capitalize 
on another metabolic feature of the 
liver— synthesis and secretion of crea- 
tine by hepatocytes. By suppressing 
miR-483 and miR-551a, colon cancer
cells upregulate expression of their 
common target, CKB, and secrete it 
into the extracellular space, where it 
catalyzes formation of phosphocreatine 
from creatine and ATP. Phosphocreatine 
is then imported back into colon cancer 
cells through the transporter SLC6A8, 
where it is used to regenerate ATP 
needed for cellular functions (Figure 1). 
Thus, by downregulating miR-483 and 
miR-551a, metastatic colon cancer cells
are able to bypass intracellular glyco- 
lytic and oxidative metabolism by acti- 
vating an alternative metabolic pathway, 
whereby they scavenge energy from the 
extracellular environment in the form of 
phosphocreatine and shuttle it into the 
cell. 



Emerging evidence suggests that there 
may be a high degree of organ speci- 
ficity in selection for metabolic traits 
enabling successful metastatic dissemi- 
nation. While Loo et al. define a metabolic 
selection mechanism for colon cancer 
colonization of the liver, previous work 
has shown that fatty acids secreted by 
adipocytes in the omentum fuel growth 
of ovarian cancer metastases expressing 
higher levels of fatty acid-binding protein 
4— a metabolic explanation for the predi- 
lection of ovarian cancer cells to metasta- 
size to the omentum compared to neigh- 
boring tissues and organs (Nieman et al., 
2011). Other sites of metastasis may 
also have unique macrometabolic proper- 
ties that enhance metastatic capacity of 
cancer cells with complementary meta- 
bolic proclivities. For instance, two com- 
mon sites of metastasis, the brain and 
the lungs (Nguyen et al., 2009), have 
high basal levels of glucose and oxy- 
gen perfusion, respectively, which may 
contribute to the high degree of metasta- 
tic colonization of these organs compared 
with others. 

Notably, loss of miR-483 and miR-551a
and increased expression of CKB, as well 
as SLC6A8, are observed in clinical sam- 
ples of colon cancer liver metastases 
compared to primary colon tumors, sug- 
gesting a possible therapeutic opportu- 
nity. Indeed, the authors show that viral 
rescue of both miR-483 and miR-551a,
as well as pharmacological inhibition 
of CKB with cyclocreatine, substantially 
decrease the extent of liver metastasis 
in a mouse model of metastatic colon 

Figure 1. Phosphocreatine Fuels Metastatic Colonization of the Liver

Reduced miR-483 and miR-551a levels in metastatic colon cancer cells liberate expression of
their mutual target, brain-type creatine kinase (CKB). CKB secreted by the metastatic cells in the liver 
phosphorylates extracellular creatine produced by hepatocytes using extracellular ATP to generate 
phosphocreatine. Extracellular phosphocreatine is imported into metastatic colon cancer cells by the 
transporter SLC6A8. Phosphocreatine is then used to regenerate the ATP needed to sustain the cell. This 
alternative metabolic pathway using extracellular phosphocreatine enables metastatic cells to survive 
metabolic stress imposed by the liver microenvironment during initial organ colonization, before tumor 
vascularization delivers canonical metabolic substrates. 



cancer. Importantly, the reliance on phos- 
phocreatine as an energy source for liver 
metastases may not be limited to colon 
cancer. Loo et al. show that pancreatic 
cancer cells, which also metastasize to 
the liver, suffer a similar decrease in met- 
astatic capacity when CKB levels are 
depleted. Little has been reported on 
CKB levels in cancer, but it will be inter- 
esting to assess whether CKB levels 
correlate with a propensity to metastasize 
to the liver in other cancer subtypes. 

The findings by Loo et al. suggest that 
targeting phosphocreatine metabolism 
may reduce liver metastasis in colon can- 
cer patients. However, to successfully 
translate these findings to the clinic, it 
will be important to determine whether 
reliance of liver metastases on phospho- 
creatine metabolism changes at different 
stages of disease. Viral rescue of miR-
483 and miR-551a and pharmacological
inhibition of CKB reduce metastatic tumor 
burden when administered within 24 hr of 



injecting mice with colon cancer cells, 
suggesting that targeting phosphocrea- 
tine metabolism could be of clinical utility 
for patients with early disseminated dis- 
ease to prevent development of liver me- 
tastases. But for patients with detectable 
secondary lesions, the value of targeting 
phosphocreatine metabolism remains 
unclear. Phosphocreatine metabolism 
may be most important for cancer cells 
to combat metabolic stress upon initial 
seeding of the liver, before expansion 
into a secondary tumor, when tumor- 
induced vasculature can deliver canonical 
metabolic substrates. Bolstering the idea 
that initial organ colonization and subse- 
quent tumor expansion require different 
metabolic programs, circulating breast 
cancer cells were recently shown to 
exhibit increased oxidative metabolism 
compared to the primary tumor and sec- 
ondary tumor outgrowths in the lungs 
(LeBleu et al., 2014). Additionally, liver 
metastases in colorectal cancer patients
exhibit glycolytic metabolism, according
to PET studies using the glucose analog
18F-fluorodeoxyglucose (Maffione et al.,
2015). 

Together, these data suggest that met- 
astatic cells may require a metabolic pro- 
gram focused on energy generation to 
survive metabolic stress upon initial colo- 
nization of a site, and may switch to a 
program using canonical metabolic sub- 
strates for growth post-vascularization. 
Understanding the dynamics of metabolic 
transitions between initial cancer dissem- 
ination, liver colonization, and metastatic 
outgrowth will aid selection of colon 
cancer patients who will benefit most 
from approaches to target phosphocrea- 
tine metabolism. 

The Myc proto-oncogene has been intensively studied in tumorigenesis and development. A new
paper in Cell reports the role of Myc as a determinant of mammalian longevity. Myc heterozygous 
mice exhibit extended lifespans resulting from alterations in multiple cellular processes distinct 
from those observed in other longevity models. 



Myc ranks among the most exhaustively 
studied genes in the vertebrate genome. 
The intense interest stems from its essen- 
tial and multiple roles in development and 
its critical functions in the etiology of 
most, if not all, cancers. The Myc protein 
is thought to function predominantly as 
a transcriptional regulator influencing 
the expression of thousands of genes 
and broadly promoting the fundamental 
cellular processes of growth, metabolism, 
proliferation, differentiation, and death in 
a context-dependent manner (for reviews, 
see Dang and Eisenman, 2014). While 
Myc has been intensively studied at the 
cellular level, relatively little is known con- 
cerning its roles in organismal physiology. 
In this issue of Cell, Sedivy and coworkers 
(Hofmann et al., 2015) provide compelling
evidence that mammalian longevity and 
healthspan are linked in multiple ways to 
the abundance of the Myc protein. 

By comparing wild-type mice (Myc+/+)
with mice expressing roughly half the
amount of Myc mRNA and protein
(Myc+/-), Hofmann et al. found a 15%
increase in median lifespan (combined 
for both sexes) caused by Myc hypomor- 
phism. Importantly, the attenuated mor- 
tality is manifested across all ages, indi- 
cating that the impact of lowered Myc 
levels is not restricted to a specific period 
of time. This augmentation in lifespan ap- 
pears to be accompanied by a diverse 
group of phenotypes that are indicative 
of an increased healthspan and correlate 
with other long-lived mouse models, 
such as calorie restricted (CR) and Ames 
dwarf mice (Gems and Partridge, 2013). 
These phenotypes include decreases in 
body size, serum IGF1 levels, immunose- 
nescence, spontaneous tumor progres- 
sion, osteoporosis, and serum and liver 
cholesterol content, as well as enhanced
metabolic rate and neuromuscular perfor-
mance, all of which are characteristic of
young animals and are observed in older 
Myc hypomorphic mice. While pheno-
typic similarities between Myc+/- and
other lifespan-extending conditions are 
apparent, there are also some significant 
differences. For example, in the Myc 
hypomorphic mice, no changes are de- 
tected in body temperature, adiposity, 
or fertility, while all three are changed 
in other mouse models of increased 
longevity. Moreover, meta-analyses of 
gene expression profiles derived from 
mice with extended lifespan due to CR 
or metformin treatment reveal only a few
changes in common with those from the 
Myc hypomorphic mice. Together, these 
data suggest that lower Myc levels pro- 
mote lifespan extension through a unique 
constellation of molecular events. In 
addition, as the authors point out, major 
consequences of decreased Myc may 
be indirect and not necessarily apparent 
at the transcriptional level. For instance, 
diminished rates of translation or sup- 
pression of mTOR activity have been 
linked to lifespan extension in flies, yeast, 
and nematodes (Gems and Partridge, 
2013). 

Previous studies in Drosophila mela- 
nogaster have implicated members of 
the Myc transcriptional network in the 
regulation of aging. Loss-of-function al- 
leles of the Drosophila Myc antagonist, 
dMnt, increase cell size and body weight 
while decreasing fly lifespan (Loo et al., 
2005). More recently, Drosophila Myc
has been shown to act as a rheostat 
for longevity, as its overexpression diminishes
and heterozygous deletion augments
longevity in flies (Greer et al., 2013).
These phenotypes appear to be indepen- 
dent of apoptosis and correlated with 



dMyc-induced genomic instability. In the 
Myc+/- mice, while apoptosis also appears
unaffected, markers of DNA damage
and senescence are unchanged and
cells derived from these mice are equally 
sensitive to toxic chemicals compared to 
the wild-type controls, suggesting that 
the increased longevity and healthspan 
of the Myc hypomorphic mice are not 
likely to be due to resistance to genomic 
instability, oxidative stress, or cellular in- 
sults. However, it is yet to be determined 
whether the increased healthspan and 
lifespan could be due to a decreased fre- 
quency of insult. 

Potential candidates for driving the 
longevity phenotype include diminished 
serum IGF1 levels and increased metabolic
rate, which are shared by Myc+/-
mice and the somatotropic mutant mouse 
strains, such as the Ames dwarf mice, and 
the CR mice (Lee and Longo, 2011). 
Importantly, IGF1 regulates mTOR activity 
via the PI3K-AKT pathway, and Myc+/-
mice exhibit decreased IGF1 and mTOR 
activity, as well as decreased rRNA pro- 
duction and translation rate. In summary, 
these mice appear to display better pro- 
teostasis. Consistent with these findings, 
treatment with the mTOR inhibitor ra- 
pamycin or haploinsufficiency of mTOR 
components or targets has been shown 
to lead to increased lifespan (Johnson 
et al., 2013).

Alterations in metabolism, as evi- 
denced by enhanced oxygen-consump- 
tion and mitochondrial content and func- 
tion, have also been associated with 
longevity. In Drosophila, expression of 
the PGC-1 transcriptional coactivator in 
intestinal stem cells (ISCs) results in stimulation
of mitochondrial biogenesis and
oxidative phosphorylation, which limits 
ISC proliferation and misdifferentiation,
Figure 1. Myc Levels Dictate Tissue Homeostasis, Organismal Size, and Lifespan

Mice lacking the Myc gene (Myc-/-) exhibit embryonic lethality, whereas compared with wild-type
(Myc+/+) controls, Myc+/- mice exhibit enhanced healthspan and lifespan. The lower abundance of Myc
protein in Myc hypomorphic mice results in a reduction in the number of proliferative progenitors and in 
the organ size, possibly by suppressing stem cell exhaustion and limiting the number of defective or 
misdifferentiated cells within the tissue, thereby maintaining tissue homeostasis. 



attenuates an age-dependent loss of tis- 
sue homeostasis, and promotes mainte- 
nance of a functional stem cell popula- 
tion characteristic of younger organisms 
(Rera et al., 2011). Thus, metabolic 
changes may help to maintain tissue 
integrity by imposing a differentiation 
pace that does not deplete somatic 
quiescent stem cells. It has been well 
established that Myc exerts a profound 
effect on metabolic activity and mito- 
chondrial function and regulates the 
size of stem, progenitor, and differentiated
cell compartments (Dang and Eisenman,
2014; Freije et al., 2014; Laurenti
et al., 2008). Indeed, Hofmann
et al. detect a higher ratio of quiescent 
long-term to proliferative short-term he- 
matopoietic stem cells in Myc hypomor- 
phic mice. This finding suggests that 
the loss of one Myc allele might be suffi- 
cient to limit the maximal abundance of 



Myc protein attained during stem and 
progenitor cell proliferation and differ- 
entiation, thereby dampening stem cell 
depletion, curtailing progenitor cell over- 
growth, and positively affecting homeo- 
stasis of multiple cell lineages (Figure 1). 

Like most important findings, the work 
of Hofmann et al. on Myc’s role in
longevity raises more questions than it 
answers. One outstanding question con- 
cerns the tissues that are most directly 
relevant to longevity in the Myc hypomor- 
phic mice. Is the longevity phenotype due 
to changes in multiple tissue types, or 
does one tissue predominate by trig- 
gering a cascade of events that ultimately 
extends both lifespan and healthspan? In 
particular, do nervous system-specific 
changes that lead to increased neuro- 
muscular fitness and metabolic alter- 
ations contribute to better healthspan? 
Is the observed decrease in translation 



rate sufficient to account for the decrease 
in cholesterol biosynthetic enzymes and 
the lower production of IGF1 in the liver 
and serum? Could enhanced proteostasis 
resulting from diminished translational 
stress affect every tissue where protein 
quality control is important? Another 
interesting question is whether other 
members of the extended Myc network 
also play a role in regulating aging. It is 
evident that the more we know about 
even such a well-studied gene as Myc, 
the more we have to learn. 
SUMMARY

The discovery that enhancers are regulated tran- 
scription units, encoding eRNAs, has raised new 
questions about the mechanisms of their activation. 
Here, we report an unexpected molecular mechanism
that underlies ligand-dependent enhancer activation,
based on DNA nicking to relieve torsional
stress from eRNA synthesis. Using dihydrotestosterone
(DHT)-induced binding of androgen receptor
(AR) to prostate cancer cell enhancers as a model,
we show rapid recruitment, within minutes, of DNA
topoisomerase I (TOP1) to a large cohort of AR-regulated
enhancers. Furthermore, we show that the DNA
nicking activity of TOP1 is a prerequisite for robust
eRNA synthesis and enhancer activation and is
kinetically accompanied by the recruitment of ATR
and the MRN complex, followed by additional components
of DNA damage repair machinery to the
AR-regulated enhancers. Together, our studies
reveal a linkage between eRNA synthesis and
ligand-dependent TOP1-mediated nicking, a strategy
exerting quantitative effects on eRNA expression
in regulating AR-bound enhancer-dependent transcriptional
programs.

INTRODUCTION 

Research over the past few years, supported by data from GRO 
sequencing (GRO-seq) analysis and the ENCODE project, has 
revealed that most developmental and regulatory transcriptional 
regulation programs are controlled by an extensive enhancer 
network (Kim et al., 2010; Shlyueva et al., 2014), with each cell 
type estimated to harbor 70,000-100,000 enhancers, located 
upstream and downstream of coding target gene promoters 
(Pennacchio et al., 2013). Enhancer signatures include monomethylated
H3K4 (H3K4me1) and H3K27-acetylated histones (Kim
et al., 2010; Li et al., 2013a; Wang et al., 2011). These enhancers
are usually characterized by a nucleosome-depleted core region 
where many of the cooperating transcription factors bind (Ander- 
sson et al., 2014; Hah et al., 2013; Kaikkonen et al., 2013; Lai 
et al., 2013; Lam et al., 2013; Li et al., 2013a; Melgar et al.,
2011; Melo et al., 2013; Mousavi et al., 2013). Most surprisingly,
enhancers are also transcription units, wherein their effect on 
target coding genes correlates with the transcription of the 
lncRNAs, referred to as eRNAs (Andersson et al., 2014; De Santa
et al., 2010; Hah et al., 2013; Kaikkonen et al., 2013; Kim et al.,
2010; Lai et al., 2013; Lam et al., 2013; Li et al., 2013a; Melgar
et al., 2011; Melo et al., 2013; Mousavi et al., 2013), adding a
new layer of regulation to the fundamental mechanisms underly- 
ing enhancer action (Lam et al., 2014; Natoli and Andrau, 2012). 

The current prevailing belief, based on chromosome capture 
assays, where looping constraints are inferred from interaction 
frequencies between a point of interest and distal loci of the 
genome, is that the main mechanism by which enhancers affect 
their target gene expression is through chromatin looping. 
eRNA transcripts seem to be functionally important by contrib- 
uting to the stabilization of juxtaposed enhancer-target gene 
promoter loops to allow for optimal gene expression (Lai et al., 
2013; Li et al., 2013a). However, both eRNA synthesis and nucle- 
osome depletion are potential sources of topological strain on 
enhancers that can possibly hinder transcription. The move- 
ment and rotation of RNA polymerase complex (RNAP) along 
DNA template during the process of RNA synthesis (Liu and 
Wang, 1987) can generate positive supercoils in front of the 
advancing RNAP, and negative supercoils behind it (Darzacq 
et al., 2007; Kouzine et al., 2013; Kouzine and Levens, 2007; 
Liu and Wang, 1987). Because RNA polymerase is a powerful 
torsional motor, it can alter DNA topology by creating DNA 
supercoils, which can propagate and affect transcription elonga- 
tion (Ma and Wang, 2014). Although negative supercoiling can 
initially facilitate transcription initiation, either by helping RNAP 
to form an open complex or by helping to recruit transcription 
factors (Ma and Wang, 2014), it can subsequently lead to the 
generation of R-loops resulting from hybridization of nascent 
RNA to the DNA strand that is being transcribed, which, in 
turn, can impede transcriptional elongation (El Hage et al., 
2010). Positive or overwound supercoiling can prevent transcription
initiation and greatly diminish mRNA synthesis (Ma and
Wang, 2014). Moreover, the very depletion of histones from the 
core region of enhancers releases unconstrained negative 
supercoils, which can impede transcription factor binding. One 
mechanism that resolves the undesirable effects of excessive 
supercoiling employs DNA topoisomerases, including topoisomerase
I (TOP1). TOP1 can relax both negative and positive
Figure 1. TOP1 Occupies AR Enhancers and Affects the Transcriptional Program of the Prostate Cancer Cell Line LNCaP

(A) Recruitment of AR and TOP1 to the KLK3 and KLK2 enhancers. The highest TOP1 binding is detected at 15 min DHT treatment. Data points show mean ± SD
(n = 3), *p < 0.05, **p < 0.01.

(B) The UCSC genome browser screenshot of the KLK3-KLK2 locus showing the occupancy of p-S5-RNA Pol II (Pol II), AR, and TOP1 (all tested with and without
DHT treatment).

supercoils by transient single-strand breaks for the passage of
individual DNA strands through one another, followed by the
rejoining of the phosphodiester backbone of DNA (Pedersen
et al., 2012; Pommier et al., 2006).

Although TOP1 activity is well established in DNA replication,
its potential functionality in enhancer activation and transcriptional
initiation remains unclear. Most of the experiments hitherto
examining the role of TOP1 in transcription have been limited
to artificial promoter model systems, which, if anything, have
argued that TOP1 DNA nicking activity is not involved in
transcriptional activation in such in vitro systems (Kretzschmar
et al., 1993; Merino et al., 1993; Shykind et al., 1997).

However, the utilization of a nicking strategy for transcriptional
initiation and enhancer-regulated events would be in concert
with the elegant explication of the molecular mechanisms underlying
the expression of bacteriophage T4 late genes, with the
participation of DNA-mounted activator of transcription, gp45,
and RNAP-bound gp33. Here, a nick in the strands of the DNA
and the actions of an exonuclease are required, with the DNA
template single-strand nicks being essential for transcriptional
activation and the nicked-DNA gp45-loading site located upstream
or downstream of its target site (Herendeen et al.,
1992). Also, in human cells, artificially generated nicks (but not
double-strand DNA breaks) have recently been found to be
associated with transcription (Davis and Maizels, 2014).
Together, these and other experiments in prokaryotes and
eukaryotes suggest an intriguing link between DNA nicking and
transcription, but the mechanism and the factors involved
remain largely unknown.

Here, we describe a molecular mechanism that operates at
functional androgen-regulated enhancers and identify DNA
topoisomerase I as a critical DNA-nicking enzyme involved in
the process of cell-specific, ligand-driven enhancer activation.
Recruitment of TOP1 to these AR-bound enhancers is of functional
consequence as knockdown of the enzyme in the prostate
cancer cells results in inhibition of DHT-regulated eRNA and
many coding gene transcriptional targets. Additionally, we provide
evidence that recruitment of a significant repertoire of
DNA damage response machinery occurs on these functional
enhancers, potentially to prevent undesirable effects of persistent
DNA damage.

RESULTS 

TOP1 Recruitment to AR-Regulated Enhancers Affects
eRNA and Coding Gene Expression 

To further investigate the mechanism of enhancer activation in
ligand-regulated transcription, we employed an early prostate
adenocarcinoma cell line, LNCaP, the growth of which is
androgen-dependent (Horoszewicz et al., 1980). The cell line is
exquisitely sensitive to androgen stimulation and arrests in
the G1 phase of the cell cycle upon steroid depletion, despite
the presence of peptide growth factors (Figure S1A). Regulation
of cyclin D expression and concomitant CDK4 activity represents
one mechanism by which androgen impinges on the cell
cycle to govern proliferation (Knudsen et al., 1998).

To investigate whether TOP1 played a role in ligand-regulated
transcription, we undertook to examine the possible recruitment
of TOP1 to enhancers, finding that it was recruited to several AR
enhancers early in response to androgen (5α-dihydrotestosterone
[DHT]) treatment in the ligand-dependent LNCaP prostate
cancer cell line (Figures 1A and S1B). These data prompted us
to study genome-wide localization of this protein by performing
chromatin immunoprecipitation coupled with next-generation
sequencing (ChIP-seq). Because exhaustive efforts to identify
a TOP1 antibody suitable for ChIP-seq proved unsuccessful,
we generated a stable LNCaP cell line with inducible biotinylated
TOP1 expression. We observed that TOP1 recruitment in
response to DHT generated enriched regions of a range of sizes
(Figure 1B), as opposed to point sources, as found for factors
such as the androgen receptor, or broad sources, such as
observed for the H3K36me3 histone mark (Sims et al., 2014).
Consistent with the observation that enhancers represent regulated
transcription units, we noticed a hormone-dependent increase
in RNA Pol II (phospho-Ser5) occupancy predominately
at these enhancers (Figures 1B and S1C). As expected, we
also observed increased TOP1 occupancy over promoters and
gene bodies of the representative DHT-induced genes (e.g.,
KLK3 and KLK2), consistent with the possibility that TOP1 might
be involved in both enhancer activation and transcriptional elongation
events.

Preliminary analysis demonstrated that TOP1 binding overlapped,
in particular, with that of liganded androgen receptor at
enhancers (Figures 1B and S1C). Genome-wide analysis revealed
6,545 putative “AR-bound enhancer” sites based on
the criterion of an AR-bound locus marked with H3K4me1 and
H3K27Ac and more than 1 kb away (in either direction) from
the promoter of annotated genes, of which 96% bound TOP1,
with 3,921 (60%) exhibiting a DHT-stimulated increase in TOP1
binding (Figure S1D).
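
The selection criterion stated above (AR-bound, carrying both H3K4me1 and H3K27Ac, and more than 1 kb from any annotated promoter) can be sketched as a simple filter. This is only an illustration under stated assumptions and is not the analysis pipeline used in the study: the `Locus` record, its field names, the `is_putative_ar_enhancer` helper, and the toy coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """Hypothetical record for one AR-bound peak (field names are assumptions)."""
    chrom: str
    position: int       # peak center, in bp
    has_h3k4me1: bool
    has_h3k27ac: bool


def is_putative_ar_enhancer(locus, promoter_positions, min_distance_bp=1000):
    """Return True if the locus carries both histone marks and lies more than
    min_distance_bp, in either direction, from every promoter on its chromosome."""
    if not (locus.has_h3k4me1 and locus.has_h3k27ac):
        return False
    same_chrom = promoter_positions.get(locus.chrom, [])
    return all(abs(locus.position - p) > min_distance_bp for p in same_chrom)


# Toy data invented for illustration; not taken from the study.
promoters = {"chr19": [50_000, 80_000]}
peaks = [
    Locus("chr19", 45_000, True, True),    # both marks, 5 kb from nearest promoter
    Locus("chr19", 50_500, True, True),    # both marks, but only 500 bp away
    Locus("chr19", 60_000, True, False),   # distal, but lacks H3K27Ac
]
enhancers = [p for p in peaks if is_putative_ar_enhancer(p, promoters)]
```

In this toy set only the first peak passes, since the second is too close to a promoter and the third lacks one of the two marks.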

To assess eRNAs induced by DHT, we took advantage of
technological advances that permit mapping of the position,
amount, and orientation of transcriptionally engaged RNA
polymerase II on a genome-wide scale (Core et al., 2008).
GRO-seq analysis (Core et al., 2008) of serum-starved LNCaP
cells treated for 1 hr with DHT identified 644 putative AR



(C) GRO-seq analysis of the effect of TOP1 knockdown on nascent RNA levels shown as a heatmap for 579 enhancers (out of 644, which were upregulated by 
DHT treatment) with the most affected AR enhancers at the top. 

(D) Heatmap showing DHT-induced TOP1 sequencing tags density increase around 644 AR-enhancer binding sites (centered on AR). 

(E) Box plot: siTOP1 reduced transcription at ~80% of DHT-upregulated AR enhancers. *p < 2.2 × 10^-16 (Wilcoxon test).

(F) Box plot: the response to DHT of 368 DHT-upregulated genes was reduced after TOP1 knockdown by siRNA.

(G) Knockdown of TOP1 affects the induction of both eRNA and mRNA. LNCaP cells, hormone-starved for 1 day and transfected with the indicated siRNA, were
stimulated with 100 nM DHT for 1 hr (eRNA) or 5 hr (mRNA) 48 hr posttransfection. qRT-PCR was performed with SYBR Green using reverse-transcribed RNA.
Data represent mean ± SD (n = 3), **p < 0.01.

(H) Recruitment of ATR to the KLK3 and KLK2 enhancers following DHT stimulation of starved cells. Data represent mean ± SD (n = 3). **p < 0.01.

See also Figure S1 and Table S1.
enhancers with significantly upregulated eRNAs (Figure 1C),
which is the best mark of activated enhancers (Hsieh et al.,
2014; Li et al., 2013a), among which 477 (~74%) were noted to
have increased TOP1 occupancy in response to ligand at
30 min (Figure 1D), and virtually all appear to exhibit DHT-increased
TOP1 binding at 15 min (Figure S5C). Because
TOP1 has been shown to affect the transcriptional activity of
RNA Pol II (Kretzschmar et al., 1993), we decided to investigate
whether knockdown of TOP1 would alter eRNA synthesis from
the androgen-regulated enhancers. Knockdown of endogenous
TOP1 by small interfering RNA (siRNA) revealed that eRNA induction
was reduced in at least 79% (507 of 644) of AR-regulated
enhancers (Figures 1C and 1E), accompanied by a decrease in
the induction of 368 coding target genes in the experiment
shown (Figure 1F), with similar results in repeat experiments.
Ninety-two percent of DHT-induced eRNAs were upregulated
more than 2-fold, with fold change average of 7.6 times. Analysis
of 100 randomly selected housekeeping genes not regulated by
DHT in our GRO-seq experiments confirmed that the specific
siRNAs used for this study had no effect on their expression (Table
S1).
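
As a quick arithmetic check, the two proportions quoted in this paragraph can be recomputed from the stated counts. The snippet below is only a sketch; the helper name `percent_of` and the variable names are introduced here for illustration, while the counts (477, 507, and 644) are those given in the text.

```python
def percent_of(part: int, whole: int) -> float:
    """Return `part` expressed as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts as stated in the text.
dht_induced_enhancers = 644   # AR enhancers with upregulated eRNA
increased_top1 = 477          # of these, with increased TOP1 occupancy at 30 min
reduced_by_sitop1 = 507       # of these, with reduced eRNA induction after siTOP1

pct_increased_top1 = percent_of(increased_top1, dht_induced_enhancers)
pct_reduced = percent_of(reduced_by_sitop1, dht_induced_enhancers)

print(f"Increased TOP1 occupancy: {pct_increased_top1:.1f}%")
print(f"Reduced eRNA induction:   {pct_reduced:.1f}%")
```

Both values round to the percentages reported above (~74% and 79%).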

To validate all major mechanistic points in this study, we chose
four enhancer-gene pairs. Three of these enhancers (KLK3E,
KLK2E, and TMPRSS2E) are validated by previous studies
(Andreu-Vieyra et al., 2011; Clinckemalie et al., 2013; Hsieh
et al., 2014). The fourth one, NDRG1E, meets the criteria of
others. It is an AR-bound element located not too far away from
the NDRG1 gene TSS (-29 kb), it is H3K4me1+, H3K27Ac+,
and following hormone stimulation the transcription unit produces
DHT-dependent, bidirectional eRNA, making it a strong
candidate. Using these enhancer sites, we found that recruitment
of the nuclear receptor coactivators (p300 and SRC-1) at AR enhancers
was diminished after siTOP1 (Figure S1E). Thus, TOP1
knockdown attenuated the induction of eRNA (1 hr DHT treatment)
and the production of mRNA of the corresponding target
genes 5 hr after ligand addition (Figures 1G and S1F). Importantly,
the fold induction (with or without DHT) was similar between independent
experiments in which eRNA levels were measured.
Surprisingly, we noted that ATR (Ataxia telangiectasia and
Rad3-related), a protein involved in DNA damage repair, was recruited
to AR-regulated enhancers at ~15 min following addition
of ligand (Figures 1H and S1G). Together, these data identify
TOP1-bound genomic regions that bear enhancer marks and
produce eRNA in a DHT-dependent manner. Knockdown of
TOP1 reduces production of eRNA and coding gene RNA for
most of these AR-regulated target genes.



NKX3.1 and TOP1 Co-Occupy Enhancer Binding Sites
and Regulate the AR Transcription Program 

NKX3.1 is an androgen-regulated transcription factor (Bhatia-Gaur
et al., 1999), which is a highly selective and specific marker
of metastatic prostatic adenocarcinoma (Gurel et al., 2010).
NKX3.1 has been found to interact with TOP1 to enhance formation
of the TOP1-DNA complex and increase TOP1 nicking of
DNA (Bowen et al., 2007). In fact, TOP1 activity in prostates of
Nkx3.1+/- and Nkx3.1-/- mice is reduced compared with wild-type
mice, but not in other organs that do not express Nkx3.1
(Bowen et al., 2007). Overlap of the reported NKX3.1 ChIP-seq
data set (Tan et al., 2012) with that of AR and TOP1 revealed
that NKX3.1 occupancy was highest at AR enhancers, with
NKX3.1 binding sites located over regions with increased
TOP1 binding (Figures 2A and S2A). We observed that AR and
TOP1 started to be recruited to AR enhancers within a few
minutes after DHT stimulation. Interestingly, siRNA-mediated
knockdown of cellular NKX3.1 inhibited recruitment of TOP1 at
enhancers of DHT-regulated genes at 5 min following DHT stimulation
(Figure 2B), in line with the previous data suggesting that
NKX3.1 is needed for the formation of the TOP1-DNA cleavage
complex (Bowen et al., 2007). We observed that, following
NKX3.1 knockdown in LNCaP cells, the DHT-dependent upregulation
of ~70% of enhancer eRNAs was significantly reduced (Figures
2C, 2D, and S2C). We also noted significant reduction in the
expression levels of 273 DHT-upregulated genes (Figure 2E),
exemplified for two representative genes (Figure 2F). Additionally,
knockdown of TOP1 and NKX3.1 reduced DHT upregulation
of eRNA at the same 351 AR enhancers in these experiments
(Figure 2G), apparently without affecting AR recruitment to the
enhancer-binding sites (Figure S2E). Together, these experiments
demonstrate that NKX3.1 and TOP1 binding occurs at a
subset of DHT-regulated enhancers, and the knockdown of
either diminishes transcription in response to ligand.

Catalytic Activity of TOP1 Is Required for DNA Nicking 
and Enhancer Activation 

Based on its mechanism of action as a DNA nickase, by which
TOP1 forms a covalent intermediate with DNA and possesses
intrinsic DNA ligase activity (Pommier et al., 1998; Champoux,
2001), it would be difficult to detect any such transient nick by
available methods. Indeed, despite extensive attempts to detect
such a nick in enhancers by primer extension approaches, only a
few examples could be clearly visualized. Thus, using this
approach to investigate whether AR-regulated enhancers might
be the sites of DNA scission by the activated TOP1, we chose the



Figure 2. NKX3.1 and TOP1 Co-Occupy a Subset of AR Enhancers and Co-Regulate the Enhancer Program

(A) The UCSC genome browser screenshot displaying a direct overlap between AR, NKX3.1, and TOP1 binding at enhancers of KLK3 and KLK2 genes. Regions
with increased (after DHT) TOP1 binding (except regions present in the background control) are underlined.

(B) Knockdown of NKX3.1 prevents TOP1 from binding at AR-regulated enhancers; siCTL-, siNKX3.1-, and siTOP1-treated cells were stimulated with DHT for
5 min. Chromatin immunoprecipitation was performed with an antibody against TOP1. Data represent mean ± SD (n = 3). **p < 0.01.

(C) Knockdown of NKX3.1 by siRNA affects the induced transcription of ~69% of the regulated eRNAs. *p < 2.2 × 10^-16 (Wilcoxon test).

(D) Heatmap of AR enhancers sorted from most-to-least affected by siNKX3.1.

(E) siNKX3.1 reduces induced transcription of 273 genes in this experiment determined by GRO-seq.

(F) The UCSC genome browser screenshot showing the KLK3-KLK2 locus. Knockdown of TOP1 or NKX3.1 by siRNA reduces eRNA and genic RNA induction.

(G) Knockdown of either TOP1 or NKX3.1 affects induction of the same 351 eRNAs in the same experiment, as measured by GRO-seq.

See also Figure S2 and Table S1.
Figure 3. TOP1 Recruits to AR-Regulated Enhancers and Nicks the DNA

(A) UCSC browser screenshot displaying the KLK3 enhancer. Arrows indicate the PRO-caps representing (putative) eRNA TSS that flank the NKX3.1 peak.

(B) eRNA readout assay showing that Tyr723 of TOP1 is required for eRNA induction. LNCaP cells were hormone-starved for 24 hr, transfected with siRNA to
knock down TOP1, and then electroporated with empty expression vector (Veh), wild-type TOP1 (WT), or the Y723F-TOP1 mutant (Mut) before treatment with
either ethanol or DHT for 1 hr. eRNA for KLK2, KLK3, and TMPRSS2 gene enhancers was quantified by RT-PCR. TOP1 mRNA and protein levels are also shown.
qPCR data show mean ± SD (n = 3). **p < 0.01.

(C) Knockdown of endogenous TOP1 affects nick/break formation as measured by incorporation of biotin 11-dUTP at selected AR enhancers after 10 min DHT
treatment. Data represent mean ± SD (n = 3). **p < 0.01.

See also Figure S3 and Table S2. 



KLK3 enhancer as a model. We examined a region overlapped
by the AR and NKX3.1 peaks and flanked by two “precision
nuclear run-on transcription initiation sites” (PRO-caps), which
mark the transcription initiation sites at high resolution (Kwak
et al., 2013), noting that PRO-cap sites could be located on
AR-regulated enhancers following hormone stimulation (Figure
3A, Table S2). Primer extension analysis of both DNA strands
with [γ-32P]-ATP-labeled oligonucleotides yielded several termination
products consistent with a series of closely spaced
DNA nicks; the strongest band that became accentuated in
response to DHT was seen on the lower strand, in support of
the notion that it may be one of the major TOP1 binding/scission
sites (Figure S3B). Moreover, detailed PRO-cap analysis to
locate the precise start sites revealed that the RNA cap sites
located on average ~134 bp away from the center of the AR
peak (Figure S3C) are occupied by TOP1 (Figure S3D) and, as
shown in GRO-seq experiments, these transcripts continue to
the end of eRNA-encoding sequence; however, for the majority
(75th percentile) of these transcripts, the GRO-seq signal starts
to fade away after 1,000 bp from the TSS/cap site.

As another approach to infer the possibility of TOP1 DNA nickase
actions in activation of AR-dependent enhancer, we sought
to mutate TOP1. TOP1 enzymatic activity depends on Tyr723
to relax superhelical DNA (Madden and Champoux, 1992).
Specifically, Tyr723 of TOP1 initiates the nucleophilic attack on
the backbone scissile phosphate resulting in nicked DNA and a
phosphodiester link between the tyrosine and 3' phosphate
(Champoux, 2001; Pommier et al., 2010). Subsequently, the
covalent intermediate is religated with concomitant release of
Tyr723 from the DNA (Champoux, 2001; Stewart et al., 1998).
We therefore tested whether the Y723F TOP1 mutant could
rescue the defect caused by TOP1 knockdown. For this purpose,
endogenous TOP1 was knocked down with specific
siRNA, and either the wild-type or the Y723F TOP1 mutant was
then expressed in LNCaP cells. Analysis of the enhancer RNA
after 1 hr DHT treatment revealed that the wild-type TOP1 largely
reinstated eRNA induction, whereas the catalytically inactive
mutant failed to do so (Figure 3B). The incomplete rescue with
the wild-type construct most likely reflected the fact that not all
cells could be efficiently electroporated with the DNA expression
vectors, probably because LNCaPs are notoriously difficult to
transfect with conventional cationic liposome reagents. Interestingly,
wild-type TOP1 relaxes supercoiled DNA only in the presence
of NKX3.1, whereas the active site mutant does not at all
(Bowen et al., 2007), consistent with the presence of TOP1 on
AR-bound enhancers. These findings are of particular interest
based on previous in vitro transcription system analyses. TOP1
has been shown to be essential for transcriptional activation in
a system containing RNA polymerase II and other cofactors
(Kretzschmar et al., 1993; Merino et al., 1993; Shykind et al.,
1997), but, in these artificial in vitro transcription systems, the
Y723F mutant did not block the transcriptional activity of the
complex at promoters. Therefore, in this context, TOP1 was proposed
to modulate transcription by changing the conformation of
DNA at the promoter or via interactions with TBP/TFIID (Kretzschmar
et al., 1993; Merino et al., 1993; Shykind et al., 1997). In
contrast, on AR-regulated enhancers, the nicking activity of
TOP1 appears to be required for its effects on eRNA transcription.

Incorporation of labeled nucleotide by terminal deoxynucleotidyl
transferase (TdT) has been considered to label both DNA
nicks and double-stranded DNA (dsDNA) breaks (Gavrieli
et al., 1992); hence, we also employed this assay on specific
enhancer sites to assess incorporation of biotin 11-dUTP in
response to DHT. Therefore, we fixed the cells with Streck Cell
Preservative (Ju et al., 2006), a formulation shown not to cause
DNA breaks during the fixation process. Biotin 11-dUTP incorporation
with TdT was observed at 10 min following addition of
DHT hormone at the several enhancers tested, and this was
strikingly reduced after TOP1 knockdown (Figure 3C). Together,
these data suggested that TOP1 recruitment to enhancers co-occupied
by AR and NKX3.1 occurred at regions proximal to
transcription initiation sites and caused single-stranded DNA
(ssDNA) nicks, although the possibility of a dsDNA break cannot
be ruled out, especially as an unligated nick can be converted to
a DSB for subsequent processing by the DSB repair pathway
(Davis and Maizels, 2014).

Involvement of MRE11 in the Regulation
of the AR Program 

The MRN complex, composed of the meiotic recombination 11
(MRE11), RAD50, and Nijmegen breakage syndrome 1 (NBS1), 
is central to the DNA damage response (DDR) pathway that is 



initiated upon recognition of the DNA breaks by sensor proteins 
(Stracker and Petrini, 2011). MRE11 regulates DNA repair by 
recruitment of DNA-repair proteins that load onto the chromatin 
at the site of the break (Price and D’Andrea, 2013). 

Recent evidence shows that cleavage of the covalent 3' phosphotyrosyl-DNA
bonds that join TOP1 to the DNA backbone by
MRE11 generates a product carrying a 3'-phosphate end that
MRE11-RAD50 can resect in an ATP-regulated reaction, producing
a 3'-hydroxyl that can prime repair synthesis (Hamilton
and Maizels, 2010; Sacho and Maizels, 2011). Interestingly,
the p300 transcriptional coactivator physically interacts with all
three members of the MRN complex (Jung et al., 2005).

Based on these considerations and the results in Figure 3, we
investigated whether MRE11 was present at AR-regulated
enhancers. Kinetic ChIP experiments using a specific antibody
(Figures 4A and S4A) revealed that MRE11 recruitment at
enhancer-binding sites peaked at 15 min of DHT treatment. On
performing ChIP-seq, we identified 19,886 loci in the (-) hormone
control and 30,636 loci in the cells treated with DHT for 15 min,
observing that MRE11 sequencing tag density at enhancers
increased with DHT treatment (Figures 4B, 4C, and S4B). We
also observed similar recruitment of the RAD50 component
of the MRN complex (Figure S4C). Genome-wide analysis
showed indistinguishable alterations in the number of tags over
promoters of these genes in response to DHT treatment,
although a small increase in MRE11 occupancy at promoters
of select DHT-regulated genes (e.g., KLK3, KLK2, NDRG1, and
TMPRSS2) could be detected by ChIP-qPCR after DHT treatment
(data not shown). GRO-seq analysis of nascent transcription
revealed that induction of ~89% of detectable enhancer
eRNAs induced by DHT was inhibited by MRE11 knockdown
(Figures 4D and 4E). In addition, expression of 510 induced
coding genes was reduced (Figure 4F). Knockdown of RAD50
caused a similar effect on eRNA and mRNA expression levels
(Figure S4D). Given the role of ATR in sensing single-strand
DNA breaks, we also investigated the potential functional role
of ATR following DHT. We found that ATR is rapidly recruited,
by 15 min, to AR-bound enhancers after DHT (Figure 1H). This
is of functional significance, because knockdown of either
MRE11 or TOP1 caused a dramatic decrease in ATR recruitment
to enhancers (Figure 4G) and a reduction of DHT-induced
enhancer and gene transcription (Figure 4H).

Recruitment of Components of DDR 
to AR-Regulated Enhancers 

Indeed, a mechanism that could be involved in the repair of a
single-strand nick would be the base excision repair (BER)
pathway, which could process nicks that evaded TOP1 ligase activity.
Therefore, we investigated whether factors involved in this or other
DNA damage repair pathways might also be recruited to
AR-regulated enhancers. We performed kinetic ChIP experiments
using antibodies against phospho-ATM (Ataxia telangiectasia
mutated), Ku80 (part of the Ku heterodimer that binds to
double-strand DNA break ends), Exonuclease 1 (EXO1), the Bloom
syndrome DNA helicase (BLM), and DNA ligase IV (LIGIV).
Additionally, we used antibodies to proteins involved in the
base excision repair pathway, including XRCC1 (X-ray repair
cross-complementing protein 1), DNA polymerases β and ε,
and Ligase I, observing orderly and reproducible kinetics of
recruitment after hormone treatment at enhancers including
KLK3, KLK2, TMPRSS2, and NDRG1 (Figures 5 and S5A), as
well as on other DHT-upregulated enhancers identified by
GRO-seq (Figure S5B). Although TOP1 and ATR were essentially
recruited simultaneously at enhancers at 15 min, XRCC1 was
recruited between 15 and 30 min, consistent with the recruitment
of base excision repair pathway machinery that could process
any unligated nicks. Interestingly, DNA ligase IV showed
maximum occupancy after 30 min, whereas pATM (p-S1983),
Ku80, EXO1, BLM, and DNA ligase I were maximally recruited
to enhancers ~60 min post DHT treatment (Figures 5 and
S5A), indicating recruitment of multiple DNA-repair factors that
have been conventionally considered to function in DNA damage
repair (Nimonkar et al., 2011). The sequence of events would be
consistent with resolving any unligated DHT/TOP1-induced
ssDNA nicks, with the DDR machinery primarily recruited as a “safety
net” against any DNA breaks that are not sealed by TOP1. From
our data, the machineries of transcription and DNA damage
repair seem to be intrinsically linked.

DISCUSSION 

Regulated gene expression has been a subject of intense investigation
over the past few decades, yet the precise mechanisms
by which enhancers orchestrate tissue-specific programs with
such astonishing precision remain unclear. In particular, the
finding that enhancers are also regulated transcription units,
encoding eRNAs, has added to the mystery and raised new
questions about how the subsequent topological strain on
enhancers is handled. Both eRNA synthesis and nucleosome
depletion at enhancers are potential sources of topological
strain. Advancing RNA polymerase can generate both positive
and negative supercoils. The amount of supercoiling is potentially
enormous given that a positive and a negative supercoil are
generated for every 10 bp transcribed and that an
eRNA transcript is typically 1-2 kbp in length. Indeed, it has
been estimated that approximately seven supercoils may be
generated by the transcribing polymerase per second, and that
these supercoils can propagate >1 kbp from the transcription
start site (Kouzine et al., 2013). At the same time, the depletion
of histones from enhancers releases unconstrained negative
supercoils, which, in principle, can parse to a change in DNA
twist or unwinding to facilitate transcription and/or to a change
in writhe that impedes transcription factor binding. It is tempting
to predict that, to relieve torsional stress, cells might employ the
actions of DNA topoisomerases, including topoisomerase I, as an
integral component of regulated enhancer transcription.
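
The scale of this torsional load can be made concrete with simple arithmetic. The sketch below uses only the figures quoted above (one positive and one negative supercoil per 10 bp transcribed, and an eRNA length of 1-2 kbp); the function name is illustrative and the calculation ignores any relaxation that occurs during transcription.

```python
# Back-of-the-envelope supercoil bookkeeping for one pass of RNA polymerase
# over an eRNA transcription unit. The 10 bp figure and the 1-2 kbp lengths
# come from the text; the function name is illustrative only.

BP_PER_SUPERCOIL_PAIR = 10  # one positive plus one negative supercoil per 10 bp


def supercoil_pairs(transcript_length_bp: int) -> float:
    """Number of (+)/(-) supercoil pairs generated over the transcript,
    assuming none are relaxed while the polymerase is moving."""
    return transcript_length_bp / BP_PER_SUPERCOIL_PAIR


for length_bp in (1000, 2000):
    pairs = supercoil_pairs(length_bp)
    print(f"{length_bp} bp eRNA: {pairs:.0f} positive and {pairs:.0f} negative supercoils")
```

Under these assumptions a single round of eRNA synthesis would generate on the order of 100 to 200 supercoils of each sign, which illustrates why an enhancer-associated relaxation activity is plausible.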

Here, we have elucidated the operation of just such a mechanism
in prostate cell-specific enhancer activation by androgen
receptor, using the LNCaP cancer cell line as a model. In a sense
analogous to the role of TOP1 at origins of replication (Simmons
et al., 1998; Tsao et al., 1993), we show here that this DNA nickase
is rapidly recruited to a large cohort of AR/NKX3.1-occupied
enhancers to putatively activate the enhancers and relieve
torsional stress due to ongoing transcription (Figure 6). Our
results are consistent with observations that, in yeast cells,
Top1/Top2 play a role in the activation of genes characterized
by high transcriptional plasticity (Pedersen et al., 2012). However,
the beneficial effects of TOP1 have to be weighed against
the negative effects of retention of TOP1 as an obstacle to further
transcription and the deleterious effects of a single-strand nick
if it is not quickly sealed by TOP1 itself, or repaired by the base
excision pathway. Unrepaired nicks could lead to the formation
of DNA double-strand breaks (DSB) as, for example, when a
replication fork runs into and collapses at a nick (Kuzminov,
2001; Wimberly et al., 2013). It has also been suggested that a
codirectional collision between the replisome and backtracked
RNA polymerase transcription elongation complexes leads to
DNA double-strand breaks (Dutta et al., 2011). Thus, one important
role for the MRN complex and other components of the
DDR machinery that we observe recruited to the TOP1-bound
enhancers might be for the removal of any “stalled” TOP1
from the DNA substrate, as well as repair of any possible DNA
breaks that might occur despite TOP1 or the BER actions (Hamilton
and Maizels, 2010; Sacho and Maizels, 2011; Davis and
Maizels, 2014).

TOP1 activity is likely to be modulated by factors other than
NKX3.1, suggesting that the mechanism we describe here may
not be restricted to prostate cells. In this regard, it has been
shown that the catalytic activity of TOP1 is stimulated by large
T antigen during unwinding of the SV40 origin (Simmons et al.,
1998) and overexpression of the antigen rendered LNCaP cells
androgen-independent for cell-cycle progression (Knudsen
et al., 1998). This raises the possibility that activation of TOP1
catalytic activity may, in part, trigger a switch to androgen
independence. The Werner syndrome helicase, WRN, has also been
found to enhance the ability of TOP1 to relax negatively
supercoiled DNA and specifically stimulate the religation step of the
relaxation reaction (Laine et al., 2003). It is therefore not unlikely
that there exist other, yet undiscovered, activators of TOP1
catalytic activity to regulate eRNA synthesis and gene expression



Figure 4. MRE11 Regulates the AR Transcription Program

(A) Recruitment of MRE11 to the selected DHT-regulated AR enhancers. Data points show mean ± SD (n = 3). *p < 0.05, **p < 0.01.

(B) MRE11 binding (sequencing tag density) increases over AR enhancers in a DHT-dependent manner (KLK3 and KLK2 genes shown).

(C) Distribution of MRE11 and AR binding (sequencing tag density) centered over AR-enhancer binding sites with DHT-induced eRNA.

(D) MRE11 knockdown reduces eRNA expression levels of 89% of DHT-upregulated eRNAs. *p < 2.2 × 10^-16 (Wilcoxon test).

(E) Heatmap for AR enhancers sorted from the most downregulated by siMRE11 at the top to the least at the bottom.

(F) Boxplot showing 510 genes, where DHT-induced upregulation of transcription (determined by GRO-seq) was reduced by MRE11 knockdown.

(G) Knockdown of either MRE11 or TOP1 affects recruitment of ATR at enhancers following hormone stimulation of the starved cells, measured after 15 min DHT
stimulation. Data show mean ± SD (n = 3). *p < 0.05, **p < 0.01.

(H) siATR affects induction of eRNA (1 hr DHT) and mRNA (5 hr DHT treatment) of the corresponding gene. Data are the mean ± SD (n = 3). *p < 0.05, **p < 0.01.
See also Figure S4 and Table S1.
Figure 5. Canonical DNA Damage/Repair Machinery Components Recruit to AR-Regulated Enhancers

Kinetic recruitment of factors implicated in the DNA damage response (DDR) to AR enhancers. All kinetic ChIP experiments were performed at least twice with
cells of similar passage number to ensure data reproducibility. Data shown as mean ± SD (n = 3). *p < 0.05, **p < 0.01. See also Figure S5.



programs. Alternatively, there may be other DNA nickases that
initiate enhancer activation in tissues other than prostate, in a
signal-dependent manner, and the activities of those
nickases may be modulated by enhancer-bound factors.



Although the finding that a ligand-dependent enhancer activation
strategy would involve a DNA nick may seem counterintuitive
in terms of cellular integrity, it is noteworthy that cellular
integrity is threatened daily by endogenous and extracellular
Figure 6. A Model for TOP1-Mediated Activation of the AR Enhancer

Following androgen stimulation, AR and DNA
topoisomerase I recruit to the enhancer region,
premarked by the NKX3.1 pioneer transcription
factor. Binding of NKX3.1 to TOP1 stimulates enzymatic
activity of the topoisomerase, resulting in nicking of
DNA on a single strand, followed by recruitment of
ATR, XRCC1, and the MRN complex components
(MRE11/RAD50). After dismissal of TOP1, ATR,
and the MRN, additional components of DNA-repair
machinery recruit to the activated enhancer.
The thin blue line indicates the presence of low
levels of residual eRNA, not totally eliminated by
hormone starvation, whereas the thick blue line
represents induced bidirectional eRNA produced
by the transcription unit.



agents that lead to the formation of single- and double-strand 
DNA breaks. For instance, the estimated number of single- 
strand breaks and spontaneous base losses in nuclear DNA 
together with other types of spontaneous damage may reach 
10^ lesions per cell per day (Hoeijmakers, 2009), yet the cells 
are programmed to survive. To maintain genomic integrity, cells 
constantly engage the DNA-repair machinery. As such, the us- 
age of a programmed DNA nicking/repair strategy in regulated 
transcription to relieve torsional stress and activate transcription 
in this case, while apparently surprising, is in keeping with 
growing evidence that components of DNA damage machinery 
do participate in transcriptional regulation. For instance, Reinberg
and colleagues demonstrated that human RNA polymerase
II complex contains components with roles in DNA repair,
including Ku70, Ku80, and DNA Pol ε (Maldonado et al., 1996),
and Kung and colleagues (Mayeur et al., 2005) have identified
heterotrimeric DNA-dependent protein kinase subunits (Ku70,
Ku80, and DNA-PKcs), as well as poly(ADP-ribose) polymerase,
as proteins associated with the C-terminal domain of AR and
demonstrated that, in LNCaP cells, Ku70 and Ku80 are recruited
to the KLK3 promoter and enhancer in a hormone-dependent
manner. Interestingly, Ku70 and Ku80 can function outside of
the Ku heterodimer that loads on double-strand DNA breaks.
Hasty and colleagues have shown that Ku80 deletion impairs 
the base excision pathway (BER) at the initial lesion recogni- 
tion/strand scission step, arguing that free Ku70 and free 
Ku80, but not the Ku heterodimers, associate with apurinic/apyr- 
imidinic (AP) sites that BER corrects (Li et al., 2013b; Choi et al., 
2014). Moreover, Mo and Dynan showed that, in normally 
growing human cells, Ku80 associated with RNA polymerase II 
elongation sites. This association occurred independently of
the DNA-dependent protein kinase catalytic subunit and was highly
selective. In addition, there was no detectable association with the
initiating isoform of RNAPII or with the general transcription initiation
factors. The authors concluded that association of Ku80 with
transcription sites is important for maintenance of global
transcription levels, because functional
disruption of a discrete C-terminal domain in the Ku80 subunit
inhibited transcription in vitro and in vivo (Mo and Dynan, 2002).
Importantly, LigIV, like Ku80, is commonly associated with the
NHEJ pathway, but its active site has been found to be highly
permissive and capable of ligating atypical DNA substrates,
including nicks with gaps (Gu et al., 2007). Interestingly, in the
absence of RNase H2, the suppression of mutations arising
from misinsertion of ribonucleoside monophosphates (rNMP)
during DNA replication involves Top1-mediated cleavage at an
rNMP, followed by unwinding of DNA by Srs2 and digestion by
Exo1 (Potenski et al., 2014). Also, earlier studies showed that
TOP1 enhanced TFIID-TFIIA complex assembly during activation
of transcription; however, in these biochemical studies, the
catalytic activity of TOP1 was not essential to activate transcription
from promoters. It is also interesting to note that the AR itself
has been shown to transcriptionally regulate a network of
DNA-repair genes, including those implicated in DNA damage sensing
(MRE11, NBN, and ATR), nonhomologous end joining (XRCC4
and XRCC5), homologous recombination (RAD54B and
RAD51C), mismatch repair (MSH2 and MSH6), base excision
repair (PARP1 and LIG3), and the Fanconi pathway (FANCI,
FANCC, and USP1) (Polkinghorn et al., 2013). Moreover, p53
itself binds enhancers and regulates eRNA synthesis for transcription
enhancement of neighboring genes (Melo et al., 2013).

Together, the recruitment of DNA damage response machin- 
ery in specific transcriptional regulatory events is an emerging 
theme, from the regulation of pluripotency in embryonic stem 
cells by the trimeric XPC-nucleotide excision repair complex 
(Fong et al., 2011) to the regulation of human RARβ2 gene via
XPG-induced DNA breaks at the promoter region (Le May
et al., 2012). Moreover, experiments with yeast have revealed
that the Rad1 ^^'^/Rad1 Mms4^'^'^ orthologs can catalyze
the endonucleolytic cleavage of DNA immediately upstream
from the Top1-DNA adduct (Pommier et al., 2010). Indeed,
permissive chromatin architecture seems to be a crucial requirement
for transcription initiation events (Fong et al., 2013).
Although these events are quite distinct from the TOP1-dependent
regulatory events described in the present manuscript,
they do suggest a common usage of the DNA damage repair 
machinery to regulate gene transcription. 

EXPERIMENTAL PROCEDURES 
Cell Culture 

LNCaP cells were purchased from ATCC and maintained in RPMI-1640
medium (Life Technologies) supplemented with 10% fetal bovine serum
(Omega Scientific), 2 mM L-glutamine, and penicillin/streptomycin. For kinetic
ChIP experiments, cells were starved in phenol-free DMEM (Lonza)
supplemented with 5% charcoal/dextran stripped fetal bovine serum (Omega
Scientific) for 72 hr. Cells were synchronized with 2.5 µM α-amanitin (Sigma) for 2 hr,
washed twice with PBS, and released. A total of 100 nM 5α-dihydrotestosterone
(DHT, Sigma) was added to the starvation media to stimulate the cells.

Small Interfering RNA 

siRNA-mediated knockdown was achieved by transfecting cells with
Lipofectamine 2000 and specific siRNAs. The following siRNAs were used for
this study: AllStars Neg. Control siRNA (1027281) was from QIAGEN. Human
ON-TARGETplus SMARTpool siRNAs against TOP1 (L-005278-00-0020),
MRE11 (L-009271-00-0020), and RAD50 (L-005232-00-0005) were purchased
from Dharmacon. Single interfering RNAs targeting AR
(SASI_Hs01_00224483, SASI_Hs01_00224484), TOP1 (SASI_Hs02_00335354,
SASI_Hs01_00047440), ATR (SASI_Hs01_00176270, SASI_Hs01_00176271), and
NKX3.1 (SASI_Hs02_00341026, SASI_Hs01_00018365) were obtained from
Sigma. Multiple siRNAs were used during the course of the study to confirm
data reproducibility.

For transfection, LNCaP cells were seeded on dishes in RPMI-1640
supplemented with 10% FBS and allowed to attach overnight. The following day,
the cells were washed twice with PBS and fed with phenol-free DMEM
supplemented with 5% charcoal/dextran FBS. One day later, the cells were
transfected using Lipofectamine 2000 and 20 pmol ml^-1 siRNA diluted in
Opti-MEM reduced serum media without phenol red (Life Technologies). The
transfection media was removed after 16 hr incubation, and the cells were
washed twice with PBS. Fresh, phenol-free DMEM supplemented with 5%
charcoal/dextran FBS and penicillin/streptomycin was added to the dishes.
Cells were harvested 48-72 hr posttransfection. All siRNAs used in this study
were validated by vendors or by us and used only if providing >70% knockdown
efficiency. Relative quantities of gene expression level were normalized
to the GAPDH gene. The relative quantities of ChIP samples were normalized
by individual inputs, respectively.
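
The normalizations described above can be sketched as follows. The text does not state the exact formulas used, so this example assumes the common 2^-ΔCt approach with 100% amplification efficiency; all function names, Ct values, and the input fraction are hypothetical and serve only to illustrate the >70% knockdown criterion and the normalization of ChIP signal to input.

```python
import math

# Hypothetical helpers: the source only states that expression was normalized
# to GAPDH and ChIP signal to input. The 2^-dCt formulas below are an assumed,
# commonly used convention, not a method confirmed by the text.


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target abundance relative to a reference gene such as GAPDH (2^-dCt)."""
    return 2.0 ** -(ct_target - ct_reference)


def knockdown_efficiency(rel_control: float, rel_knockdown: float) -> float:
    """Fraction of transcript lost in siRNA-treated cells versus control cells."""
    return 1.0 - rel_knockdown / rel_control


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """ChIP signal as a percentage of input chromatin.

    input_fraction is the share of chromatin set aside as input (e.g. 0.01).
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


# Illustrative Ct values only (not data from the study).
control = relative_expression(ct_target=22.0, ct_reference=20.0)
knockdown = relative_expression(ct_target=24.0, ct_reference=20.0)
efficiency = knockdown_efficiency(control, knockdown)
print(f"knockdown efficiency: {efficiency:.0%}, usable: {efficiency > 0.70}")
```

With these illustrative values, a two-cycle shift in the target Ct at constant GAPDH Ct corresponds to a fourfold reduction in transcript, which would satisfy the >70% threshold.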

ChIP-qPCR 

Chromatin immunoprecipitation experiments were done as previously 
described (Garcia-Bassets et al., 2007). All ChIPs and qPCRs were repeated 
at least three times, and representative results are shown. p values were calculated
by using a two-tailed Student’s t test.

GRO-Seq and PRO-Cap 

Global run-on sequencing (GRO-seq) was performed as detailed (Wang et al., 
2011), and precision nuclear run-on sequencing of transcription initiation 
sites (PRO-cap) was performed as described (Kwak et al., 2013). 

Antibodies 

AR (N-20), TOP1 (H-300), ATR (N-19), RAD50 (H-300), XRCC1 (H-300), BLM
(H-300), DNA Ligase I (C-21), DNA Ligase IV (H-300), DNA POL β (C-21),
DNA POL ε 3/CHRAC17 (N-15), p300 (C-20), and SRC-1 (M-341) were from
Santa Cruz Biotechnology. MRE11 (Ab397) and p-S1983-ATM (Ab2888)
were obtained from Abcam. Ku80 (A302-627A) and EXO1 (A302-639A)
were purchased from Bethyl Laboratories. See also Extended Experimental
Procedures.
SUMMARY

Cells must respond sensitively to time-varying inputs 
in complex signaling environments. To understand 
how signaling networks process dynamic inputs 
into gene expression outputs and the role of noise 
in cellular information processing, we studied the im- 
mune pathway NF-kB under periodic cytokine inputs 
using microfluidic single-cell measurements and sto- 
chastic modeling. We find that NF-kB dynamics in fi- 
broblasts synchronize with oscillating TNF signal and 
become entrained, leading to significantly increased 
NF-kB oscillation amplitude and mRNA output com- 
pared to non-entrained response. Simulations show 
that intrinsic biochemical noise in individual cells im- 
proves NF-kB oscillation and entrainment, whereas 
cell-to-cell variability in NF-kB natural frequency 
creates population robustness, together enabling 
entrainment over a wider range of dynamic inputs. 
This wide range is confirmed by experiments where 
entrained cells were measured under all input pe- 
riods. These results indicate that synergy between 
oscillation and noise allows cells to achieve efficient 
gene expression in dynamically changing signaling 
environments. 

INTRODUCTION 

Understanding how cells efficiently process information in 
rapidly changing and noisy environments is a fundamental prob- 
lem in biology. Cells experience environments that fluctuate 
over time during physiological conditions such as inflammation,
where oscillating input signals can occur due to pulsatile secretion
of signaling molecules from immune cells (Goldbeter et al.,
1990; Han et al., 2012), propagating signaling waves (Falcke,
2003; Schutze et al., 2011; Yde et al., 2011), or by coupling be- 
tween upstream pathways (Gerard and Goldbeter, 2012; Goldb- 
eter and Pourquie, 2008; Yang et al., 2010; Yoshiura et al., 2007). 
How cells process such dynamic inputs into functional gene 
expression outputs is not well understood. Further, signaling 
systems are subject to biochemical noise originating from sto- 
chastic molecular interactions, leading to system noise and 
cell-to-cell variability in response to input signals. While it is a 
common belief that noise is harmful to information processing 
(Cheong et al., 2011), cell signaling pathways perform with
remarkable robustness despite ever-present system noise and
variability (Little et al., 1999). It is not clear how cell-signaling 
pathways overcome system noise and whether there are func- 
tional roles for noise in cellular information processing. 

Signaling systems often employ oscillatory network architec- 
ture to process environmental inputs (Levine et al., 2013). For 
example, specific transcriptional responses can be achieved 
by encoding the dose or identity of a constant input signal by 
modulating oscillatory response dynamics (Kupzig et al., 2005). 
Theoretically, oscillation can be advantageous also in the pro- 
cessing of fluctuating and noisy input signals. For example, dy- 
namic inputs that contain noise can be transmitted efficiently in 
an oscillating system through a phenomenon called stochastic 
resonance (Douglass et al., 1993), previously observed in 
neuronal circuits (McDonnell and Ward, 2011). Nevertheless, 
such a beneficial role for biochemical system noise in the 
processing of fluctuating environmental signals has not been 
shown. 

To study how oscillation and system noise may interact in pro- 
cessing of dynamic input signals, we consider the NF-kB sys- 
tem, a gene regulatory network central to immune functions 
and many diseases, including autoimmunity and cancer (Hayden 
and Ghosh, 2008). NF-kB pathway activation by TNF cytokine 
leads to oscillations in p65:p50 heterodimer localization between 
the cytoplasm and nucleus (Hayden and Ghosh, 2008; Hoffmann 
et al., 2002; Nelson et al., 2004), mediated by NF-kB-dependent
induction of negative feedback genes of the IkB family (Figure 1A).
Pathway activation through IKK under TNF occurs in a
digital, switch-like fashion (Tay et al., 2010).

NF-kB oscillations are subject to intrinsic and extrinsic noise,
leading to variable timing between cells that obscures single-cell
behavior in population analyses (Swain et al., 2002; Tay
et al., 2010). Sources of extrinsic noise include different signaling
histories and uneven cell division leading to variation in protein
abundance (Huh and Paulsson, 2011). Variation in TNF receptor
or NF-kB molecules creates different response characteristics
between cells (Tay et al., 2010). Significant contributions to
intrinsic (biochemical) noise in NF-kB include burst-like transcription
of IkB and A20 negative feedback genes, and receptor-ligand
interaction at low ligand concentration (Elowitz et al.,
2002; Tay et al., 2010). Another negative feedback gene, IkBε, is
induced with a 45 min delay compared to IkBα, which is optimally
timed for increasing cell-to-cell oscillation variability, suggesting
that transcriptional noise might provide a functional
advantage (Ashall et al., 2009; Paszek et al., 2010).

The function of NF-kB oscillation is not fully understood. Other
pathways like p53 and Notch convert between oscillatory and
non-oscillatory response to achieve specific cell fate responses
(Dolmetsch et al., 1998; Kageyama et al., 2008; Purvis et al.,
2012; Purvis and Lahav, 2013). The frequency of oscillation in
ERK, Crz1, and NFAT4 is altered depending on input signal
concentration, achieving expression control across diverse promoters
through frequency modulation (Albeck et al., 2013; Berridge
et al., 2003; Cai et al., 2008; Dolmetsch et al., 1998; Eldar
and Elowitz, 2010; Shankaran et al., 2009; Yissachar et al.,
2013). However, NF-kB oscillation frequency (90 to 100 min
peak-to-peak interval) is unchanged across a wide range of input
concentrations (Longo et al., 2013; Tay et al., 2010; Turner et al.,
2010), and it is uncertain how NF-kB oscillation changes and directs
gene expression in response to a fluctuating input.

Oscillatory systems can experience resonance, where a periodic
stimulus leads to amplified output (Abraham et al.,
2010; Pikovsky et al., 2003). Periodic input may also entrain
or synchronize a population of oscillators so that all oscillators
adopt the same frequency and phase. Entrainment leading to
resonant amplification of NF-kB oscillations may occur for input
signals that fluctuate at a rate similar to the NF-kB natural frequency,
increasing the sensitivity of the NF-kB system especially
to small signals. Theoretical studies predict that periodic
input to NF-kB may generate entrainment, quasiperiodic oscillations,
or even chaos (Jensen and Krishna, 2012; Wang et al.,
2011). Entrainment, with prominent examples from circadian
rhythms and brain waves, allows oscillatory signaling and
transcriptional pathways to synchronize and work in harmony (Reppert
and Weaver, 2002; Varela et al., 2001). It is conceivable
that entrainment could reduce cell-to-cell NF-kB oscillation
variability, leading to homogeneous transcriptional responses
at the population level. Nevertheless, experimental studies are
lacking on whether NF-kB can experience resonance and
entrainment, and whether there is impact on gene expression
output and variability (Longo et al., 2013; Tay et al., 2010;
Turner et al., 2010).

Although noise is detrimental to signal transmission in linear
systems, it can facilitate information transfer in a non-linear system
by decreasing the amplitude of a periodic input needed to
achieve coupling (Collins et al., 1996; Lindner et al., 2004; Mori
and Kai, 2002; Zhou et al., 2002). For example, input noise can
facilitate sensory neuron processing (McDonnell and Ward,
2011) and intrinsic noise may cause oscillations to become
more robust to perturbation (Paszek et al., 2010; Perc and Marhl,
2003; Vilar et al., 2002). Extrinsic noise (i.e., variation in signaling
parameters between cells) may also impact entrainment due to
increased population diversity, similar to bacterial bet hedging
when external conditions change (Mondragon-Palomino et al.,
2011; Suel et al., 2006; Wakamoto et al., 2013). However, it is
not known how noise could affect entrainment of a complex
and physiological mammalian system such as NF-kB.

To probe how oscillation and noise together determine NF-kB-dependent
transcription in dynamic settings, we used a microfluidics-based
experimental pipeline that enabled automated
cell stimulation, live imaging, and gene expression measurements
(Gomez-Sjöberg et al., 2007; Junkin and Tay, 2014; Kellogg
et al., 2014; Tay et al., 2010). We delivered various TNF
cytokine inputs to p65-/- mouse 3T3 fibroblast cells expressing
p65/DsRed fusion protein at near wild-type levels (Lee et al.,
2009; Tay et al., 2010) (Figures 1B, 2A, and 3A). The microfluidic
chip utilizes computer-controlled PDMS membrane valves,
allowing constant perfusion or periodic pulsing of signaling factors.
Ninety-six independent cell culture experiments, each with
complex fluidic conditions, can be maintained in parallel. Cell images
acquired at 5 min intervals were automatically analyzed,
extracting thousands of single-cell trajectories of NF-kB nuclear
intensity over time (Kellogg et al., 2014). Following periodic TNF
stimulation, cells were retrieved for gene expression analysis in a
high-throughput microfluidic qPCR system to understand the influence
of entrainment on target gene expression (Kellogg et al.,
2014). Furthermore, we performed stochastic simulations using
an established model of NF-kB (Tay et al., 2010) and varied
both intrinsic and extrinsic noise to interpret our experimental
findings and understand the role of noise and oscillations for
NF-kB dynamic signal processing.

RESULTS 

Constant TNF Stimulation Generates Sustained, Noisy 
NF-kB Oscillations 

Previous studies of NF-kB dynamics were subject to TNF ligand
loss due to degradation and cellular internalization, leading to
damped oscillations (Tay et al., 2010). Here, constant TNF concentration
was achieved by perfusion of fresh TNF-containing
media using an on-chip peristaltic pump (Figure 1B). Under
constant TNF concentration, we observed NF-kB oscillations
sustaining longer than 24 hr with mean period approximately
90 min (Figures 1C and 1D and Movie S1). Cells exhibited
different natural frequencies (between-cell variability) and
cycle-to-cycle timing fluctuation (within-cell variability) (Figure 1F).
NF-kB oscillation is robust to changes in dose, and lowered
dose, which generates high receptor-ligand noise, modestly
lengthened the average period and increased period variability
(CV: coefficient of variation, in Figure 1E). These findings show
that NF-kB oscillation sustains under constant TNF input, pointing
to a conserved function for oscillations.

To understand how noise underlies oscillation variability, we
performed simulations of NF-kB dynamics under constant TNF
concentration. We used a stochastic single-cell model, which
faithfully reproduces the NF-kB dynamics in single 3T3 fibroblast
cells used in this study (Tay et al., 2010). The simulations showed
sustained oscillation and period characteristics similar to our
experiments (Figure 1G) (Lipniacki et al., 2004; Lipniacki et al.,
2007; Tay et al., 2010). Varying intrinsic noise in the model was
associated with changes in cycle-to-cycle variability, while changing
extrinsic noise affected variability in average (natural) oscillation
period between cells (Figures 1G and 1H). Magnitudes of
variability for simulations matched those of experimental
measurements, supporting an appropriate balance between intrinsic
and extrinsic noise in the model (Figure 1I).
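
The two kinds of variability discussed here can be quantified as coefficients of variation (CV) of oscillation periods. The sketch below is one plausible way to compute them from per-cell peak times; the exact procedure used in the study is not described in this section, so the definitions (within-cell CV as the mean of per-cell period CVs, between-cell CV as the CV of per-cell mean periods), the function names, and the toy peak times are all assumptions.

```python
from statistics import mean, stdev

# Assumed definitions, for illustration only:
#   within-cell CV  = average over cells of CV(periods of that cell)
#   between-cell CV = CV of the per-cell mean periods


def periods(peak_times):
    """Peak-to-peak intervals (min) from a sorted list of peak times."""
    return [later - earlier for earlier, later in zip(peak_times, peak_times[1:])]


def cv(values):
    """Coefficient of variation: sample standard deviation over mean."""
    return stdev(values) / mean(values)


def variability(cells):
    """Return (within_cell_cv, between_cell_cv) for a list of peak-time lists.

    Each cell needs at least three peaks, and at least two cells are required.
    """
    per_cell_periods = [periods(peaks) for peaks in cells]
    within = mean(cv(p) for p in per_cell_periods)
    between = cv([mean(p) for p in per_cell_periods])
    return within, between


# Toy peak times in minutes (not data from the study).
toy_cells = [
    [0, 90, 180, 270],   # perfectly regular 90 min oscillator
    [0, 100, 200, 300],  # perfectly regular 100 min oscillator
]
within_cv, between_cv = variability(toy_cells)
print(f"within-cell CV = {within_cv:.3f}, between-cell CV = {between_cv:.3f}")
```

In this toy case each cell is perfectly regular, so all of the variability is between cells; cycle-to-cycle jitter within a trace would raise the within-cell CV instead.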

Single-Cell NF-kB Dynamics becomes Entrained under 
an Oscillating Cytokine Signal 

During inflammation, cells operate under dynamic TNF signals,
which may interfere with NF-kB oscillations needed for processing
of input dose information and differential gene expression
(Tay et al., 2010). Depending on frequency and amplitude,
Figure 1. Noise Origins of Sustained NF-kB
Oscillation and Heterogeneity

(A) NF-kB transcription factor oscillates between 
cytoplasm and nucleus in response to inflamma- 
tory signals. NF-kB dynamics relay external sig- 
nals to gene expression outputs. 

(B) We deliver continuous or periodic inputs to 
cells using microfluidic cell culture. In continuous 
mode, TNF is flowed over cells to maintain con- 
stant concentration. 

(C) We record single-cell NF-kB translocation
using live-cell microscopy. Images show nucleus-cytoplasm
oscillations in NF-kB (p65-dsRed) under
continuous TNF perfusion. Scale bar, 10 µm.

(D) Under constant 10 ng/ml TNF concentration, 
NF-kB shows long sustaining, asynchronous 
oscillations. 

(E) NF-kB oscillates with mean period ~90 min un- 
der constant high and low dose input (n = 40 cells). 

(F) Pictorial depiction of within- and between-cell 
oscillation variability. While cells may have 
different mean periods (between-cell variability), 
each cell also exhibits fluctuation in its own 
oscillation (within-cell variability). 

(G) Simulated single cell trajectories show that 
extrinsic noise increases between-cell variability, 
and intrinsic noise increases within-cell vari- 
ability. 

(H) In simulations increasing extrinsic noise in- 
creases between-cell variability, while increasing 
intrinsic noise increases within-cell variability. 

(I) Experimentally, we observe ~12% period fluc- 
tuation between different cells and 34% within 
the same cells under 10 ng/ml TNF. Variability 
in simulated traces agrees with experimentally 
measured values. 
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periodic input to an oscillator like NF-kB can either entrain or 
disrupt the oscillation. Entrainment describes when the oscillator 
becomes phase-locked and synchronized with the driving stim- 
uli (Pikovsky et al., 2003), with prominent examples in biology 
from circadian rhythms (Leloup and Goldbeter, 2003; Reppert 
and Weaver, 2002) and brain waves, to synthetic bacterial oscil- 
lators (Mondragon-Palomino et al., 2011). On the other hand, 
when entrainment cannot occur due to a significant frequency
mismatch between the oscillator and input signal, the result is 
a disrupted oscillation that is quasiperiodic or even chaotic 
(Jensen and Krishna, 2012; Pikovsky et al., 2003). How NF-kB 
responds to sustained periodic inputs that could entrain NF-kB 
oscillations has not been experimentally investigated so far. 

To test the entrainment capacity of NF-kB and how oscillation
contributes to gene expression control under fluctuating cytokine
signals, we applied TNF inputs to fibroblasts using two stimulation
periods (Figure 2A). In the first case, TNF stimulus was applied
every 120 min, which efficiently entrained NF-kB after a transient
(Figure 2B). In the second case, TNF stimulus was provided every
60 min, which was sufficiently mismatched from the ~90 min NF-kB
natural period to induce a disrupted, non-entrained NF-kB response in
most cells (Figure 2C). Movies S2, S3, and S4 show single cells
under these inputs as well as under 90 min input, and Figure S1
shows the difference between entrained cells and cells oscillating
under constant TNF signal. Images selected at three
time points show that for 60 min input NF-kB oscillations remain
asynchronous in the population, whereas for 120 min input NF-kB
oscillates synchronously across cells (Figures 2D and 2E; see also
Movies S2, S3, and S4). Population phase variability, a measure
of synchrony, remains high during the time course with
60 min stimulation, but it quickly decreases during 120 min
stimulation as the population entrains (Figures 2F and S1B).
Simulations with our comprehensive NF-kB model also reproduced
non-entrained and entrained responses under similar inputs
(Figure 2G).
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Entrained NF-kB Oscillation Improves Gene Expression 
Efficiency and Reduces Cell-to-Cell Variability 

NF-kB regulates hundreds of pro- and anti-inflammatory genes 
(Hao and Baltimore, 2013). To understand the influence of en- 
trained versus disorderly NF-kB dynamics in gene expression, 
we measured time-dependent expression of target genes for 
120 min and 60 min periodic TNF stimulation using microfluidic 
qPCR (Figures 3A, 3B, S2, and S3). Cells stimulated in
independent chambers of the cell culture chip were harvested
for expression analysis at 30 min time increments (Figure 3A;
see protocols in Kellogg et al., 2014). Under the entraining
120 min input, gene expression output was notably enhanced,
especially in genes with later induction times (Figure 3C, red
lines). In contrast, the 60 min input, which leads to a non-entrained
NF-kB response, caused an impaired transcriptional response
(Figure 3C, blue lines). Importantly, the difference in measured
mRNA expression is not due to a difference in total TNF exposure,
as there is greater TNF exposure for 60 min input (Figure 3D).

We analyzed single-cell NF-kB trajectories for entrained and
non-entrained conditions to identify what might give rise to
the observed gene expression difference. Since differences are
most evident in the later part of the time course, we focused
our analysis on times after 500 min. We measured NF-kB area
under the curve (AUC) and determined the extent to which each trace is
oscillatory versus non-oscillatory by power spectral analysis.
The AUC is a measure of total NF-kB protein localization into
the nucleus, which did not change enough to explain



Figure 2. Periodic TNF Stimulation Can 
Entrain or Disrupt NF-kB Oscillations 

(A) We deliver periodic inputs to cells using a mi- 
crofluidic cell culture chip. In periodic mode, TNF 
is replaced at specified intervals, and ligand decay 
(due to cell uptake and degradation) leads to a 
periodic sawtooth concentration profile. 

(B) Single-cell traces for stimulation at 120 min 
input period, which entrains and synchronizes NF- 
kB oscillations. 

(C) Single-cell NF-kB trajectories measured for 
stimulation at 60 min input, which disrupts NF-kB 
oscillations. 

(D) Image time series for 120 min periodic input for
times t1-t3 indicated by arrows in (B). The entrained
cell population oscillates synchronously. Inset:
Nucleus color indicates nuclear NF-kB intensity
from red (low) to green (high). Scale bar, 25 μm.

(E) Image time series for 60 min periodic input for
times t1-t3 indicated by arrows in (B). The cell
population does not synchronize.

(F) Phase variability for 60 min stimulation remains 
constant over time. In contrast, during 120 min 
stimulation (red), phase variability decreases as 
the cell population synchronizes over time. 

(G) Simulations reproduce non-entrained and entrained
responses. See also Figure S1.



the observed gene expression difference.
Entrained cells showed only a 9% increase in NF-kB
area compared to non-entrained cells. However, we
measured an 83% increase in NF-kB oscillatory energy
(Figure 3E). Non-entraining input at 60 min mostly resulted in
non-oscillatory localization profiles, while entraining input at
120 min resulted in strong oscillations with large amplitude
(Figure 3F).
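
As a rough illustration of the two trace-level measures used here, the following Python sketch (standing in for the Matlab routines; not the authors' code) computes the area under a trace and a simple spectral score. The function names, the definition of "oscillatory energy" as summed power over non-zero frequencies, and the synthetic traces are all assumptions made for the example.

```python
import cmath
import math

def trapz(y, dt):
    """Trapezoidal integral of evenly sampled values (spacing dt)."""
    return dt * (sum(y) - 0.5 * (y[0] + y[-1]))

def power_spectrum(y):
    """|DFT|^2 of the mean-subtracted trace for harmonics k = 1..N//2."""
    n = len(y)
    mean = sum(y) / n
    centered = [v - mean for v in y]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        spectrum.append(abs(coeff) ** 2)
    return spectrum

def oscillatory_energy(y):
    """Hypothetical score: total power at non-zero frequencies."""
    return sum(power_spectrum(y))

def dominant_period(y, dt):
    """Period (same units as dt) of the strongest harmonic."""
    spectrum = power_spectrum(y)
    k = max(range(len(spectrum)), key=spectrum.__getitem__) + 1
    return len(y) * dt / k

# Synthetic traces sampled every 5 min for 600 min (made-up values).
DT = 5.0
times = [i * DT for i in range(120)]
flat = [1.0 for _ in times]
oscillating = [1.0 + 0.8 * math.sin(2 * math.pi * t / 120.0) for t in times]
```

In this toy case the two traces have nearly the same area, while only the oscillating trace carries spectral power, which mirrors the contrast between a small change in area and a large change in oscillatory energy.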

To understand how increased oscillation magnitude under entraining
input could lead to higher transcriptional output, we
simulated traces for 60 and 120 min sawtooth input, similar to
those used in experiments. Simulated NF-kB single cells exhibited
similar changes in area and oscillatory energy for entrained
versus non-entrained conditions (Figures 3G and S2).
However, the existing transcriptional model, which assumed
that NF-kB binding to DNA increases linearly with nuclear NF-kB
concentration (Tay et al., 2010), did not reproduce increased
gene expression for the entrained condition (Figure S2). Experiments
indicate that NF-kB binds DNA cooperatively with Hill coefficient
(Phelps et al., 2000) (Figure S2A). Introducing this
non-linearity in our model created significantly increased
transcriptional output for entrained versus non-entrained conditions,
in agreement with our experiments (Figures 3G and S2C) (Wee
et al., 2012). Increasing intrinsic noise led to stronger oscillations
and further amplified the NF-kB-induced gene expression
(Figure 3G). Thus, entraining input leads to strengthening of
NF-kB oscillations, which are further amplified by noise to drive
increased transcriptional output.
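
The argument that cooperative binding rewards stronger oscillation can be illustrated with a small Python sketch. It compares time-averaged promoter occupancy for a flat and an oscillating nuclear NF-kB trace with the same mean, under linear and Hill-type binding. The Hill coefficient, half-saturation constant, and trace shapes below are hypothetical values chosen for illustration, not parameters from the model.

```python
import math

# Hypothetical parameters (not taken from the paper).
N_HILL = 3      # assumed Hill coefficient
K_HALF = 2.0    # assumed half-saturation, in units of mean nuclear NF-kB

def linear_binding(x):
    """Occupancy proportional to nuclear concentration."""
    return x / K_HALF

def hill_binding(x):
    """Cooperative occupancy with Hill coefficient N_HILL."""
    return x ** N_HILL / (K_HALF ** N_HILL + x ** N_HILL)

def mean_occupancy(trace, binding):
    """Time-averaged occupancy for an evenly sampled trace."""
    return sum(binding(x) for x in trace) / len(trace)

# Five full 120 min cycles sampled every 5 min; both traces have mean 1.
steps = 120
flat = [1.0] * steps
oscillating = [1.0 + 0.8 * math.sin(2 * math.pi * i * 5.0 / 120.0)
               for i in range(steps)]

linear_flat = mean_occupancy(flat, linear_binding)
linear_osc = mean_occupancy(oscillating, linear_binding)
hill_flat = mean_occupancy(flat, hill_binding)
hill_osc = mean_occupancy(oscillating, hill_binding)
```

With linear binding the two averages coincide, so oscillation amplitude alone cannot raise output. With the assumed Hill function, which is convex over most of this concentration range, the oscillating trace yields a higher average occupancy.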

We asked whether entrainment could reduce cell-to-cell vari- 
ability in transcriptional output in our simulations. Comparing the 
coefficient of variation of mRNA output over time indicates that 
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cell-to-cell transcription variability is significantly reduced under 
entraining input (Figure 3G). This result is consistent across 
simulated early, middle, and late-response genes (Figure S3). 
With reduced mRNA variability between individual cells, oscilla- 
tions appear even in the population averaged experimental time 
course, especially in late genes (Figure 3C). Therefore, through 
entrainment that reduces gene regulatory and gene expression 
variability between cells, one may increase response homogene- 
ity of a cell population. 
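
A minimal Python sketch of the variability measure, assuming the coefficient of variation is taken across cells at each time point. The mRNA values below are invented for illustration and are not data from the study.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)

def cv_over_time(mrna_by_cell):
    """CV across cells at each time point.

    mrna_by_cell: list of per-cell time series of equal length.
    """
    return [coefficient_of_variation(column) for column in zip(*mrna_by_cell)]

# Made-up mRNA levels: three cells, three time points per condition.
entrained = [[10, 12, 14], [11, 12, 13], [10, 13, 14]]
non_entrained = [[4, 9, 2], [10, 3, 8], [6, 14, 5]]

cv_entrained = cv_over_time(entrained)
cv_non_entrained = cv_over_time(non_entrained)
```

Comparing the two lists time point by time point gives the kind of CV time course referred to above.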

These results reveal the important role for NF-kB oscillations 
in generating efficient transcription and indicate that periodic 



Figure 3. Entrained NF-kB Oscillations 
Improve Transcriptional Efficiency 

(A) Cells are cultured and provided periodic stim- 
ulation on chip and harvested for qPCR analysis. 

(B) TNF stimulation with 60 min period (blue) leads 
to non-entrained NF-kB response, and most indi- 
vidual cells do not synchronize with the input. 

(C) NF-kB regulated gene expression under non- 
entrained (60 min stimulation) and entrained 
(120 min stimulation) conditions. Higher tran- 
scriptional output is seen when NF-kB oscillations 
are entrained. Enhanced transcription occurs 
consistently for early, middle, and late genes. The 
effect is most pronounced for late responding 
genes (i.e., Ccl5).

(D) Gene expression output measured by area
under curve (AUC). AUC is higher for entrained
compared to non-entrained NF-kB response. 
Although 120 min stimulation increases transcript 
production, it is not due to higher TNF exposure, 
which is lower compared to 60 min stimulation. 

(E) Analysis of single-cell NF-kB trajectories shows 
modest increase in response area (p = 0.04) and 
strong increase in oscillation energy (p = 0.0002) 
(bars indicate median ± interquartile range, 
p values by Mann-Whitney test.) 

(F) Example NF-kB trajectories (for later part of 
time course starting at 500 min) and correspond- 
ing power spectra for 60 and 120 min input (blue 
and red lines, respectively), showing stronger 
oscillation under entrained (120 min) input. 

(G) Stochastic NF-kB simulation of cells under 
either entraining or non-entraining input (left) and 
gene expression output (right). Due to non-linear 
binding of NF-kB to DNA, stronger oscillation 
under entraining input creates increased gene 
expression output, in agreement with experiments. 
Increasing intrinsic noise amplifies oscillations and 
leads to even higher transcription output. mRNA 
cell-to-cell variability (measured by Coefficient of 
Variation, CV) is lower for entrained cells, indicating 
that entrainment reduces cell-to-cell mRNA vari- 
ability compared to non-entrained cells. 

See also Figures S2 and S3.



signaling inputs can amplify transcrip- 
tional outputs by resonantly stimulating 
oscillatory pathways like NF-kB, with a 
beneficial role for intrinsic noise in further 
improving the transcriptional output. 
Moreover, simulations and experiments 
show reduced cell-cell variability in mRNA level for entraining 
input. Therefore entrainment provides a way to increase both 
expression output and homogeneity of a cell population. 

Stochastic Modeling Shows Noise-Enhanced NF-kB 
Oscillation and Entrainment 

We next turned to simulations to evaluate the robustness of NF- 
kB entrainment to changes in the TNF input period and the influ- 
ence of noise. Using the deterministic implementation of our 
model, we simulated periodic TNF stimulation of NF-kB in single 
cells and calculated entrainment ranges. In the space spanned 
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Figure 4. Stochastic Modeling Predicts Entrainment to be Robust and that Noise Underlies Enhancement in Oscillation and Entrainment 
Range 

(A) Deterministic Arnold tongues (gray shaded regions) computed for decay-type TNF input show that entrainment is readily achieved in narrow regions around 
90 min and 180 min (Tf/Tn = 1, 2) periodic stimulation (10 ng/ml TNF). Entrainment is also possible for 30 and 45 min input (Tf/Tn = 1/3, 1/2). Locations of 
experimentally tested values are indicated by blue circles. 

(B) Input-output phase relationship and phase-locking. Phase between TNF input and NF-kB output is calculated as the distance from each NF-kB peak to the start 
of the previous TNF cycle, normalized by the input period. When phase change between cycles is less than a threshold (|φ_{t+1} − φ_t| < 0.15), input and output are
considered phase-locked. Locking can occur at 1:1 ratio (one input cycle for one output cycle), or other ratios such as 1:2 (one input cycle for two output cycles).

(C) Adding intrinsic noise amplifies and sustains NF-kB oscillations in the model under constant TNF input (example single cell traces are shown). 

(D) Comparison of entrainment in simulated NF-kB trajectories under low and high intrinsic noise. Under 120 min periodic TNF stimulation, high noise leads to an
entrained response indicated by 120 min peak in the power spectrum. In contrast, the response under low noise is not entrained as seen in the power spectrum 
that shows weaker oscillations and only at the natural period. 

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by input modulation amplitude and period (Tp), entrainment oc- 
curs in triangular regions called Arnold Tongues (Figure 4A) (Erz- 
berger et al., 2013; Jensen and Krishna, 2012). On the edges of 
Arnold Tongues, synchrony between the input and oscillator
breaks down, leading to quasiperiodic or aperiodic rhythms.
Deterministic Arnold tongues for NF-kB indicated entrainment
principally when the stimulation period is near a 1:1 or 1:2 ratio with
the natural period (Tf/Tn = 1, 2) (Figure 4A), meaning that
entrainment is expected when the stimulation occurs with a
period near 90 min or near 180 min under 10 ng/ml TNF input.

To understand the role of different noise levels in entrainment,
we simulated periodic TNF signals with varied intrinsic and
extrinsic noise conditions and quantified NF-kB phase locking
by comparing the phase of the next cycle, φ_{t+1}, to that of the
current cycle, φ_t. If the phase difference |φ_{t+1} − φ_t| was less
than a threshold (0.15), then the response was considered locked over
that cycle (Figure 4B). Our hybrid model based on the Gillespie
algorithm incorporates experimentally verified intrinsic noise in TNF
receptor-ligand binding, which is dominant at small TNF doses, and in
transcription of IκBα and A20, which constitute the main
negative-feedback loops leading to oscillations (Tay et al., 2010).
In particular, transcriptional noise arises from stochastic interaction
of NF-kB transcription factors with the two copies of the IκBα
and A20 genes. We reduced the transcriptional noise by
increasing the gene copy number and proportionally reducing the
gene expression rate per copy to maintain unchanged gene
expression and a similar NF-kB natural oscillation period between
models. In the stochastic model with greatly reduced intrinsic
noise, entrainment occurs for narrow regions around 90 and
180 min stimulation under high-dose (10 ng/ml) TNF periodic
input (Figures 4E and 4F), similar to those in the deterministic
simulations. Simulations with high intrinsic noise under the
same TNF input led to a significant broadening of the entrainment
regions (Figures 4E and 4F, dotted line). The intrinsic noise
level in these simulations was matched to the experimental level
in Figure 1. High intrinsic noise in our simulations increased NF-kB
oscillation amplitude of single cells and supported sustained
oscillations needed for entrainment (Figure 4C). The power spectrum
provides information about entrainment, and the degree of
entrainment is indicated by the relative amount of spectral power
at the input period. Example simulated NF-kB single-cell trajectories
for 120 min input are seen in Figure 4D, showing significantly
increased oscillation and spectral power at the input
period for the high noise case. To determine how noise effects
depend on TNF dose and modulation level, we simulated the
same model using computationally efficient stochastic differential
equations, which showed that intrinsic noise improves NF-kB
power at the input period when the input modulation is smaller (i.e.,
weaker driving stimuli), as in higher-dose periodic TNF stimulation
(Figure S4).
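
The phase-locking criterion described in this section can be sketched in a few lines of Python. This is an illustrative reading of the rule, not the authors' implementation: the phase of each NF-kB peak is taken as the time since the most recent input pulse divided by the input period, and consecutive phases are compared with the plain (non-circular) difference, so wrap-around near 0 and 1 is not handled. The peak times are invented.

```python
def peak_phases(peak_times, input_period):
    """Phase of each peak relative to the preceding input pulse, in [0, 1)."""
    return [(t % input_period) / input_period for t in peak_times]

def locked_cycles(peak_times, input_period, threshold=0.15):
    """True for each cycle whose phase differs from the previous by < threshold."""
    phases = peak_phases(peak_times, input_period)
    return [abs(b - a) < threshold for a, b in zip(phases, phases[1:])]

def locked_fraction(peak_times, input_period, threshold=0.15):
    """Fraction of cycles classified as phase-locked."""
    flags = locked_cycles(peak_times, input_period, threshold)
    return sum(flags) / len(flags)

# Hypothetical peak times in minutes.
entrained_peaks = [40, 161, 279, 402, 520]    # follows a 120 min input
free_running_peaks = [30, 120, 210, 300, 390]  # ~90 min rhythm under 60 min input

entrained_score = locked_fraction(entrained_peaks, 120)
free_running_score = locked_fraction(free_running_peaks, 60)
```

A trace whose peaks keep a steady offset from each input pulse scores near 1, whereas a trace drifting through the input cycle scores near 0.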

Extrinsic noise generates cell-to-cell variability in NF-kB natural
period (Figures 1G and 1H). When the natural period in an
individual cell is sufficiently close to the TNF input period,
entrainment will occur. Extrinsic noise in the system thus increases
the probability that at least a portion of cells in the population
will entrain to a given input. When we included extrinsic noise in
addition to intrinsic noise in our simulations, we observed a
further broadening of NF-kB entrainment ranges (Figures 4E
and 4F). Overall, these simulations indicate that extrinsic and
intrinsic noise together enable cells to entrain and drive efficient
transcriptional responses for a wider range of dynamical inputs.

NF-kB Entrainment Range Is Very Broad as Predicted by 
Noisy Simulations 

To experimentally test the robustness of NF-kB entrainment to
changes in the input, we applied TNF inputs with 30 to 180 min
periods to fibroblasts cultured in separate chambers of the
microfluidic system, under three different TNF doses of 10, 0.5,
and 0.1 ng/ml (Figures 5 and S5). The dataset contains analysis
of approximately 2,000 cells over 24 hr measured every
5 min, creating more than half a million data points (Movies S2,
S3, and S4). Heatmaps with one row for each single-cell NF-kB
trajectory show population synchrony that improves with
time (Figure 5A). Periodic stimulation with reduced dose leads
to even better entrainment (Figure 5A). The fraction of NF-kB
cycles locking to different entrainment ratios was computed for
each stimulation condition, and as anticipated, 1:1 locking is
maximized when the stimulation period is near 90 min and 1:2
locking is maximized for 180 min stimulation (Figure 5B).
Surprisingly, we observed cells having entrained oscillations in every
input period tested, even in those inputs like 120 min that are
not predicted to entrain by the deterministic or low-noise simulations.
Good agreement is seen between experimental entrainment
values and high-noise model simulations (both under 10 ng/ml
TNF dose) incorporating both extrinsic and intrinsic noise
(Figure 5B). We did not observe dependence on cell density
(Figure S6).

A consequence of natural period diversity is entrainment
heterogeneity, including the ability for different cells in the
population to entrain at different ratios. Cells entrained at multiple
ratios or did not entrain and exhibited quasiperiodic oscillation
(Figure 6C). Period probability follows a multimodal distribution,
indicating a simultaneous mixture of, for example, 1:1 and 1:2
locking responses in the population (Figure 6A). Simulations
incorporating only extrinsic noise also generate mixed locking
responses, indicating that locking heterogeneity can arise from
extrinsic noise (Figure 6D).

Period distributions show narrowing with reduced TNF dose,
supporting more effective entrainment (Figure 6B). Comparing
mean pairwise Spearman correlation for population responses
at each input revealed increased correlation as dose decreases
and at larger input periods (Figure 6E). Therefore, NF-kB is more
amenable to entrainment for input levels in the middle of its dose
dynamic range (Tay et al., 2010), in agreement with simulation
(Figure S5C). Entrainment is more efficient under input periods larger
than 60 min, and frequencies larger than 0.02 min⁻¹ (50 min



(E and F) Entrainment simulation: Under low-noise simulation (extrinsic noise off, intrinsic noise reduced), entrainment occurs for narrow regions around 90 min
stimulation period (1:1 entrainment, E) and around 180 min period (1:2 entrainment, F). Increasing intrinsic noise in the model broadens regions of entrainment
(dotted line). Extrinsic noise alone also increases entrainment range for the population. Adding both extrinsic and intrinsic noise further expands entrainment. 
See also Figure S4. 
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Figure 5. NF-kB Entrainment Range Is Wide and Agrees with Noisy Model Predictions 

(A) Heatmaps of single-cell NF-kB trajectories under different doses of TNF (10, 0.5, and 0.1 ng/ml) for stimulation periods ranging from 30 to 180 min. Color 
indicates NF-kB intensity from low (blue) to high (red). Entrainment of individual cells can be visualized with the appearance of well-aligned peaks. Entrainment 
and synchronization is pronounced for 90 and 180 min stimulation. Reduced dose leads to greater oscillation synchrony and improved entrainment across all 
stimulation periods (Figure S5B). 

(B) Comparison of entrainment scores for 10 ng/ml stimulation in various locking modes shows agreement between experiments and noisy model prediction. See
also Figure S5. 



period) are not observed in the single-cell power spectra, indicating
that the NF-kB system acts like a filter that prevents transmittance
of rapid TNF input fluctuations into transcription.
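
The population synchrony measure mentioned above, mean pairwise Spearman correlation, can be sketched in plain Python as follows. This is an illustrative implementation with average ranks for ties; the helper names and the short example traces are assumptions, not material from the study.

```python
from itertools import combinations

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def mean_pairwise_spearman(traces):
    """Average Spearman correlation over all pairs of single-cell traces."""
    pairs = list(combinations(traces, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)

# Invented traces: peaks aligned (synchronous) versus shifted (asynchronous).
synchronous = [[0, 1, 3, 2, 0.5], [0.1, 1.2, 2.9, 2.5, 0.4], [0, 0.9, 3.1, 2.2, 0.6]]
asynchronous = [[0, 1, 3, 2, 0.5], [3, 2, 0.5, 0, 1], [0.5, 0, 1, 3, 2]]
```

A population whose cells rise and fall together gives a value near 1, and the value drops as the traces fall out of step.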

DISCUSSION 

Here, we provide insight into the function of transcription factor
dynamics and noise in gene expression control under fluctuating
signaling inputs. Sustained, heterogeneous single-cell NF-kB
oscillations synchronize to an oscillating TNF signal in a wide
range of stimulation frequencies and become entrained. Entrainment
causes amplification of NF-kB oscillations and increased
gene expression (Figure 7). Simulations predict that both intrinsic
and extrinsic noise can improve NF-kB entrainment range,
allowing cells to respond synchronously to a broader range of
inflammatory signals (Figure 4). Single-cell measurements confirmed
that NF-kB entrainment indeed occurs in the broad range
predicted by stochastic modeling (Figure 5). While extrinsic
noise leads to differences in oscillation frequency between cells,
creating heterogeneous locking behavior and increasing entrainment
robustness of the population to changes in input period, a
surprising finding is the beneficial role for intrinsic noise in
dynamical signaling: molecular fluctuation arising from low
copy-number feedback transcripts (IκB and A20) can act to
enhance NF-kB oscillation and expand the range of inputs that
entrain NF-kB and ultimately enhance target gene expression
(Figure 7). Increased gene expression was explained by incorporating
data on non-linear NF-kB-DNA binding affinity into the
model, and we see the greatest differential regulation for late
genes such as Ccl5, in agreement with findings that late genes
are more sensitive to oscillatory regulation (Ashall et al., 2009;
Wee et al., 2012).

Together, our results describe important functions for oscillation
and noise in signaling networks. Transcription factor oscillation
allows amplified pathway output in response to a periodic
stimulus and thus increases system efficiency by reducing the
amount of input signal needed to generate a strong response.
Oscillation moreover allows control of heterogeneity through
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Figure 6. Population Heterogeneity and Dose Dependence of NF-kB Entrainment 

(A) Period probability distributions for 10 ng/ml TNF input reveal entrainment at multiple ratios between the input and output period. Under 90 min input, the 
population entrains nearly homogeneously with a 90 min phase-locked oscillation (1:1 ratio, red line). In contrast, during 150 min stimulation cells may respond
with a 150 min oscillation (1:1 ratio, red line), or a 75 min oscillation (1:2 ratio, blue line), or without phase-locking (orange line).

(B) Period distributions for multiple TNF concentrations. Lower concentration leads to period distribution narrowing, indicating improved entrainment and 
reduced cell-to-cell variability. 

(C) Measured single-cell NF-kB traces and power spectra for each locking ratio and an example quasiperiodic response (Not locked). 

(D) Simulation with extrinsic noise shows that different locking ratios may occur simultaneously (blue line -1:1 locking, red and green lines -1:2 locking). 

(E) Mean pairwise Spearman correlation in NF-kB dynamics indicating better population entrainment at lower input concentration and at higher input periods.
The NF-kB system efficiently filters rapid input fluctuations with periods shorter than 50 min. 

See also Figure S6. 



synchronization of gene regulatory dynamics across the population.
By enhancing oscillation and entrainment bandwidth,
noise facilitates efficient transcription in dynamic signaling
contexts.



Cytokines like TNF activate multiple signaling pathways, and
resonant pathway stimulation provides a way to achieve specific
responses. A low-dose signal, delivered periodically, could
excite NF-kB oscillations and activate NF-kB signaling while
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Figure 7. Role of Intrinsic and Extrinsic 
Noise in NF-kB Entrainment and Enhanced
Gene Expression 

(A and B) Entrainment score for different inputs 
shown on the left side; single-cell NF-kB time 
course shown in the middle; and the correspond- 
ing mRNA output is shown on the right. (A) Black 
curve on the left shows the entrainment range of a 
given cell with intrinsic noise, and dashed gray 
curve shows the narrower noise-free entrainment 
range. Signaling inputs at the edge of the 
entrainment range cause non-entrained NF-kB 
responses and small amplitude (in blue), resulting 
in impaired gene expression output. Intrinsic noise 
improves the amplitude and the regularity of NF- 
kB oscillations (in red), resulting in increased gene 
expression output. Intrinsic noise can increase 
entrainment score and also the bandwidth, where 
cells entrain to a broader range of input periods. 
(B) Extrinsic noise creates cell-to-cell variability in 
the entrainment range, resulting in a broader 
entrainment bandwidth for the population. Popu- 
lation variability in entrainment potential ensures 
that at least some cells will entrain under a given 
input period. 



avoiding activation of non-oscillatory pathways (such as AP-1).
Entrainment with resonance also allows more efficient communication.
Indeed, we show that a periodic resonant stimulus
achieves greater pathway output while at the same time
requiring fewer TNF molecules than a non-entraining stimulus.

Oscillation with resonance may act as a filter. Non-entraining
inputs like rapid TNF fluctuations are effectively attenuated at
the gene expression level. This may allow the NF-kB system to filter
out fast cytokine fluctuations that are not physiological (i.e., input
noise). While NF-kB exhibits a robust natural period of ~90 min,
researchers are finding oscillation in many signaling pathways
with differing characteristic frequencies. Therefore, the pathway
specificity of a pleiotropic factor such as TNF might be tuned
by changing the frequency with which it stimulates a cell.
Nonetheless, it is likely that oscillatory pathways are linked within and
between cells more than is currently appreciated (Kupzig et al.,
2005), and temporal filtering allows cells to achieve specific
responses based on the frequency content of input signals.

NF-kB both responds to and drives cytokine production, and
oscillatory cytokine production has been observed in activated
single T cells (Han et al., 2012). Entrainment of NF-kB could be
a coordination mechanism during infection, by controlling paracrine
signals that instruct migration or fate determination of immune
cells (Yde et al., 2011). TNF-positive feedback in secretory
immune cells such as macrophages could improve entrainment
at higher cell density, creating a more amplified (and more
homogeneous) response (Pekalski et al., 2013). The broad entrainment
range of the NF-kB system allows cells to adapt their oscillation
frequency and gene expression dynamics to match cytokine
fluctuation in the environment.

Our findings suggest a surprising role for noise and oscillation
in mammalian signal transduction and transcriptional control
(Figure 7). In dynamic, physiological signaling scenarios, oscillation
provides cells the ability to decode not only the amplitude
but also the frequency content of input signals. Inputs occurring
near the natural frequency of an oscillatory system are amplified
and generate higher gene expression output, while other input
frequencies generate an attenuated response. By enhancing
oscillation and entrainment at small signal modulation, noise
may improve the transfer of weaker dynamic signals in the NF-kB
system. Entrainment allows efficient cell-cell communication,
control of cell-cell heterogeneity, and the possibility of selectively
activating oscillatory pathways through resonant stimulation. The
prevalence of oscillation in signaling networks suggests that
cells are well-equipped for processing dynamic signals.

EXPERIMENTAL PROCEDURES 

TNF-α Stimulation Using Microfluidic Cell Culture

We used the cell culture chip described previously (Gómez-Sjöberg et al., 2007).
Cells were seeded in PDMS chambers coated with fibronectin at constant
density (~20,000 cells/cm²) and were cultured overnight prior to stimulation.
Standard culture conditions of 5% CO2 and 37°C were maintained using an
incubation chamber. Mouse TNF-α (Invitrogen) was diluted in DMEM media in
vials pressurized with 5% CO2 and kept on ice. Microbore tubing (PEEK, Idex)
connected the TNF-α supply to the chip. For continuous pumping input, the
on-chip peristaltic pump was operated at a flow rate of ~200 nl/min. For periodic
input, TNF-α-containing media was introduced and incubated in the chamber,
allowing degradation and internalization of the ligand. The chamber volume was
replaced with fresh TNF-α-containing media at defined intervals, leading to a
periodic sawtooth pattern in ligand concentration.
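
To make the periodic input concrete, the sketch below models the sawtooth profile in Python under a strong simplifying assumption: that ligand loss between medium replacements follows first-order (exponential) decay with a fixed half-life. The half-life value and function names are hypothetical; the text only states that degradation and internalization reduce the ligand between replacements.

```python
def sawtooth_tnf(t, dose, period, half_life):
    """TNF concentration at time t (min) for medium replaced every `period` min.

    Assumes exponential ligand decay with the given half-life (hypothetical).
    """
    time_since_replacement = t % period
    return dose * 0.5 ** (time_since_replacement / half_life)

def total_exposure(dose, period, half_life, duration, dt=1.0):
    """Approximate integral of concentration over `duration` (left Riemann sum)."""
    steps = int(duration / dt)
    return sum(sawtooth_tnf(i * dt, dose, period, half_life) * dt
               for i in range(steps))

# Example with an assumed 30 min half-life and a 10 ng/ml dose.
exposure_60 = total_exposure(10.0, 60, 30.0, 240)
exposure_120 = total_exposure(10.0, 120, 30.0, 240)
```

Under these assumptions the 60 min schedule delivers more total ligand than the 120 min schedule over the same duration, which is the direction of the exposure difference reported in the Results.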

Cell Retrieval and Gene Expression Analysis 

Cells were loaded into the cell culture chip, and a Matlab program delivered 
60 min or 120 min periodic inputs with start times staggered by 30 min to 
generate time points from 0 to 23.5 hr. Cells in one chamber (approximately 
200 cells) were retrieved for each time point. At the conclusion of stimulation, 
cells in all chambers were lysed at once on-chip, and retrieved in a 2 nl volume 
of lysis buffer using an automated routine. Cells exited the chip through 
~10 cm length microbore tubing positioned into wells of a 96-well plate. 
Wash steps using PBS prior to retrieval prevented cross-contamination of 



390 Cell 160, 381-392, January 29, 2015 ©2015 Elsevier Inc.







chambers. cDNA was synthesized using Cells Direct One Step RT-PCR kit (In- 
vitrogen). TaqMan primers and probes (Applied Biosystems) were used for 
real-time qPCR. Gene expression was assayed using the 48.48 Dynamic Array 
IFC chip (Fluidigm). Cycle thresholds (CT) were converted to relative
expression values normalized to GAPDH (2^(CT_GAPDH − CT_gene)). Total
expression abundance was calculated as the integral of the relative
expression (using the Matlab trapz function).
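
As a worked illustration of this quantification, the Python sketch below converts cycle thresholds to relative expression and integrates over time. It assumes the common 2^(CT_reference − CT_gene) normalization with perfect amplification efficiency, uses a hand-written trapezoidal rule in place of Matlab trapz, and runs on invented CT values.

```python
def relative_expression(ct_gene, ct_reference):
    """Expression relative to the reference gene, assuming doubling per cycle."""
    return 2.0 ** (ct_reference - ct_gene)

def trapz(y, x):
    """Trapezoidal integral of y over (possibly uneven) sample points x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

# Invented example: three time points (hr) for one target gene.
time_hr = [0.0, 0.5, 1.0]
ct_target = [24, 22, 23]
ct_gapdh = [20, 20, 20]

expression = [relative_expression(g, r) for g, r in zip(ct_target, ct_gapdh)]
total_abundance = trapz(expression, time_hr)
```

Each cycle of difference from the reference corresponds to a two-fold change, so a target detected four cycles later than GAPDH has a relative expression of 1/16.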

Image Acquisition and Data Processing 

The microfluidic chip was mounted on an automated Leica DMI6000B
microscope, and fluorescence images (red and green channels for p65 and H2B
reporters, respectively) were acquired at 20× magnification via a Retiga-SRV
CCD camera (QImaging) every 5-6 min for 24-48 hr. CellProfiler software
(www.cellprofiler.org) and custom Matlab routines (Gómez-Sjöberg et al.,
2007) were used for image processing (available on request). NF-kB activation
was quantified as mean nuclear fluorescence intensity after background 
correction. Area-under-curve provides a measure of total NF-kB activity (Tay 
et al., 2010) and was quantified as the integral of the NF-kB response (using 
Matlab trapz). For peak analysis and heatmaps data were smoothed and stan- 
dardized (Matlab functions smooth and zscore) followed by peak detection 
(Matlab mspeaks). Peak-to-peak distances were computed as the difference 
between peak times (Matlab diff). Cell image overlays aided visualization of 
oscillation peaks (colored green) and troughs (colored red). 
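
The peak-analysis steps above can be approximated in Python as follows. This is a simplified stand-in, not a port of the Matlab functions: smoothing is a plain moving average with truncated windows at the edges, standardization uses the sample standard deviation, and peak detection is a basic local-maximum test rather than the wavelet-based mspeaks routine. The trace is synthetic.

```python
import math
import statistics

def smooth(y, window=5):
    """Centered moving average; windows are truncated at the trace edges."""
    half = window // 2
    out = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out

def zscore(y):
    """Standardize to zero mean and unit sample standard deviation."""
    mean, sd = statistics.mean(y), statistics.stdev(y)
    return [(v - mean) / sd for v in y]

def find_peaks(y, min_height=0.0):
    """Indices of local maxima that exceed min_height."""
    return [i for i in range(1, len(y) - 1)
            if y[i] > y[i - 1] and y[i] >= y[i + 1] and y[i] > min_height]

def peak_to_peak(times, y):
    """Distances between successive peak times of the smoothed, standardized trace."""
    standardized = zscore(smooth(y))
    peak_times = [times[i] for i in find_peaks(standardized)]
    return [b - a for a, b in zip(peak_times, peak_times[1:])]

# Synthetic trace: 90 min oscillation sampled every 5 min for 6 hr.
times = list(range(0, 360, 5))
trace = [1.0 - math.cos(2 * math.pi * t / 90.0) for t in times]
periods = peak_to_peak(times, trace)
```

For noisy experimental traces a larger smoothing window or a minimum peak separation would likely be needed; those choices are left open here.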

NF-kB Reporter Cell Line 

Mouse (3T3) fibroblasts expressing near-endogenous p65 levels were 
described previously (Tay et al., 2010). Briefly, p65-/- mouse 3T3 fibroblasts
were engineered to express p65-DsRed under control of 1.5 kb p65 promoter
sequence (Lee et al., 2009; Tay et al., 2010). A clone was selected with mini- 
mum detectable fluorescence intensity to achieve near-endogenous expres- 
sion level and NF-kB dynamics similar to wild-type (Lee et al., 2009). Addition 
of ubiquitin-promoter driven H2B-GFP expression provided a nuclear label to 
facilitate automated tracking and image processing. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, six 
figures, and four movies and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.01.013.
SUMMARY

Colorectal cancer primarily metastasizes to the liver 
and globally kills over 600,000 people annually. 
By functionally screening 661 microRNAs (miRNAs) 
in parallel during liver colonization, we have identi- 
fied miR-551a and miR-483 as robust endogenous 
suppressors of liver colonization and metastasis. 
These miRNAs convergently target creatine kinase, 
brain-type (CKB), which phosphorylates the metabo- 
lite creatine, to generate phosphocreatine. CKB is 
released into the extracellular space by metastatic 
cells encountering hepatic hypoxia and catalyzes 
production of phosphocreatine, which is imported 
through the SLC6A8 transporter and used to 
generate ATP — fueling metastatic survival. Combi- 
natorial therapeutic viral delivery of miR-551a and 
miR-483-5p through single-dose adeno-associated 
viral (AAV) delivery significantly suppressed colon 
cancer metastasis, as did CKB inhibition with a 
small-molecule inhibitor. Importantly, human liver 
metastases express higher CKB and SLC6A8 levels 
and reduced miR-551a/miR-483 levels relative to pri-
mary tumors. We identify the extracellular space as
an important compartment for malignant energetic 
catalysis and therapeutic targeting. 

INTRODUCTION 

Colorectal cancer is the third leading cause of mortality in the 
United States and a major cause of death globally (Davis and 
Schlessinger, 2012; Jemal et al., 2011; Siegel et al., 2014). Death 
from colorectal cancer is primarily due to metastatic progres-
sion, with the liver being the organ of metastatic colonization in
over 70% of patients. To date, efforts aimed at increasing cure 
rates after surgery have focused on combination chemotherapy 
administration as a means of preventing metastasis. Such ther- 
apy reduces metastatic relapse by roughly 7% (Meyerhardt and 
Mayer, 2005). The high prevalence of this disease and the lack of 
effective adjuvant therapeutics demand a greater understanding 




of the biology of its progression (Markowitz and Bertagnolli,
2009). 

In recent years, posttranscriptional deregulation has emerged 
as a key feature of metastatic cells. In particular, specific micro- 
RNAs (miRNAs), which are small noncoding RNAs, have been 
identified that are silenced or overexpressed and act to suppress
or promote metastatic progression by diverse cancer types (Lu-
jambio and Lowe, 2012; Ma et al., 2007; Pencheva and Tavazoie,
2013; Pencheva et al., 2012; Tavazoie et al., 2008). While the use
of these miRNAs as molecular probes for the identification of 
metastasis regulators has proved fruitful, their therapeutic utility 
has been limited given the inefficient delivery of miRNAs into 
various metastatic tissues. Interestingly, the liver represents an 
exception to this rule, because miRNAs tend to accumulate in
hepatic tissue and because vectors such as adeno-associated 
viruses and nanoparticles have shown promising efficacy in 
enhancing hepatic delivery in nonhuman primates and humans 
(Kota et al., 2009; Mingozzi and High, 2011). Given this unique
feature of the liver as well as the great need for targeted therapies 
that can suppress liver metastatic colonization by colon cancer, 
the identification of miRNAs that could suppress liver metastasis
would be of great clinical value. 

By screening 661 human miRNAs in parallel for their ability to
suppress the colonization of the liver by multiple colon cancer
cell lines representing diverse mutational subtypes, we have
identified miR-551a and miR-483 as endogenous suppressors
of colon cancer metastasis. We find that these miRNAs both
target Creatine kinase Brain (CKB). Disseminated metastatic
cells release this enzyme into the extracellular space, where it
catalyzes the phosphorylation of the metabolite creatine by us-
ing extracellular ATP as the phosphate source. Phosphocreatine
is then imported into disseminated colorectal cancer cells where
its high-energy phosphate is used to generate intracellular ATP
that sustains the energetic requirements of colon cancer cells 
encountering hepatic hypoxia, allowing them to survive this bar- 
rier to metastatic progression. Therapeutic viral delivery of these 
miRNAs to the liver and disseminated colon cancer cells via ad- 
eno-associated viral delivery strongly suppresses metastatic 
colonization by colon cancer cells. Moreover, small-molecule 
therapeutic inhibition of CKB activity also suppresses metastatic 
growth. Our findings delineate a druggable molecular network 
that governs both the metabolic state and the metastatic pro- 
gression capacity of disseminated colon cancer cells. More 
importantly, we implicate the extracellular space as a previously
unrecognized environment for malignant catalysis and identify 
CKB as a secreted metabolic kinase that drives cancer 
progression. 

RESULTS 

Endogenous miR-483-5p and miR-551a Suppress 
Human Colorectal Cancer Metastasis 

In vivo selection has been used by many investigators to identify 
candidate genes that regulate metastatic progression of diverse 
cancer types. This approach allows one to derive highly metasta- 
tic subpopulations with enhanced metastatic activity for a given 
organ (Fidler, 1973). The comparison of transcriptomic profiles of
metastatic derivatives to the parental lines from which they
were derived has revealed numerous candidate genes for func- 
tional testing (Bruns et al., 1999; Kang et al., 2003; Minn et al., 
2005; Pencheva et al., 2012; Png et al., 2012; Tavazoie et al., 
2008). As a first step to identify the molecular regulators of liver 
colonization by colon cancer cells, we performed in vivo selec- 
tion on the LS-174T (K-Ras mutant) human colon cancer line 
for enhanced liver colonization activity through iterative intra-he- 
patic injection of cancer cells into immunodeficient mice fol- 
lowed by surgical resection of liver colonies and dissociation of 
cells. Independently derived third-generation liver colonizers 
LS-LvM3a and LS-LvM3b displayed significantly enhanced 
(>50-fold) capacity for liver colonization upon intrahepatic injec- 
tion relative to their parental line (Figure 1A). Importantly, these 
derivatives also displayed dramatically enhanced (>150-fold)
liver metastatic capacity upon portal circulation injection in 
metastasis assays (Figure S1A available online)— revealing the 
acquisition of liver colonization capacity to be sufficient for im- 
parting enhanced liver metastasis activity. As an orthogonal 
approach, we transduced a library of lentiviral particles, each en- 
coding one of 661 human miRNAs, into two independent colon 
cancer cell lines— the WiDR (K-Ras wild-type) and SW620 (K- 
Ras mutant) human lines. These cancer populations, containing 
cancer cells expressing each of 661 miRNAs, were then intrahe- 
patically injected into mice to allow for selection of cells capable 
of colonizing the liver. Genomic PCR amplification of lentiviral-
derived miRNA sequences and miRNA profiling of miRNA inserts
allowed for the quantification of miRNA insert representation
(Figure 1B; Table S1). We identified miRNAs that displayed
reduced representation in the context of liver colonization in
both colon cancer cell lines on the basis that overexpression
of these miRNAs suppressed liver colonization by colon cancer 
cells. We next asked whether endogenous forms of these miR- 
NAs exhibited silencing in highly metastatic derivatives relative 
to isogenic poorly metastatic parental cells. Indeed, two of the 
miRNAs, miR-483-5p and miR-551a, were found to be silenced 
in highly metastatic LS-LvM3a and LS-LvM3b liver colonizers
relative to their parental line (Figure S1B; Table S2). Consistent
with a suppressive role for these miRNAs in liver colonization,
overexpression of miR-483-5p or miR-551a robustly suppressed
metastatic colonization by LS-LvM3b cells introduced into the
portal circulation (Figures 1C and S1C), while inhibition of endog-
enous miR-483-5p or miR-551a in poorly metastatic parental
lines SW480 and LS-174T significantly enhanced liver metastatic
colonization (Figures 1D and S1D). The effects of these miRNAs
on metastatic progression were not secondary to modulation of
proliferative capacity because miR-551a inhibition did not affect
in vitro proliferation, while miR-483-5p inhibition minimally 
increased proliferation (10%)— an order of magnitude less than 
its effect on metastasis (Figure S1E). Importantly, overexpres- 
sion of either miRNA did not suppress primary tumor growth 
(Figure S1F).
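
The logic of the pooled screen described in this section, finding miRNA inserts whose representation drops after liver colonization in both cell lines, can be illustrated with a small Python sketch. The scoring used here (log2 fold change of library-normalized counts with a pseudocount, and a fixed depletion threshold) is an assumption made for illustration, not the analysis reported in the text; the insert names and counts are placeholders, not data from the screen.

```python
import math


def log2_fold_change(pre, post, pseudocount=1.0):
    """Per-insert log2 ratio of post- to pre-selection library fractions."""
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    changes = {}
    for name, count in pre.items():
        frac_pre = (count + pseudocount) / pre_total
        frac_post = (post.get(name, 0) + pseudocount) / post_total
        changes[name] = math.log2(frac_post / frac_pre)
    return changes


def depleted_in_all(screens, threshold=-1.0):
    """Inserts whose log2 fold change is at or below threshold in every screen."""
    hits = None
    for pre, post in screens:
        changes = log2_fold_change(pre, post)
        depleted = {name for name, value in changes.items() if value <= threshold}
        hits = depleted if hits is None else hits & depleted
    return sorted(hits or [])


# Placeholder read counts for two hypothetical cell lines (pre, post selection).
line_1 = ({"insert_a": 1000, "insert_b": 1000, "insert_c": 1000, "insert_d": 1000},
          {"insert_a": 100, "insert_b": 1000, "insert_c": 1500, "insert_d": 200})
line_2 = ({"insert_a": 1000, "insert_b": 1000, "insert_c": 1000, "insert_d": 1000},
          {"insert_a": 150, "insert_b": 300, "insert_c": 2000, "insert_d": 1200})

candidates = depleted_in_all([line_1, line_2])
```

Intersecting the depleted sets across cell lines mirrors the requirement that a candidate suppressor show reduced representation in both lines; in the placeholder data only insert_a meets it.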

To better investigate the mechanism(s) by which these miR- 
NAs exert their anti-metastatic effects, we employed an in vitro 
liver organotypic slice culture system to study early events dur- 
ing liver metastasis subsequent to single-cell dissemination of 
colon cancer cells in the liver microenvironment (Figure S1G). 
Consistent with prior studies, which revealed a significant selec- 
tion on cell survival during metastatic colonization (Gupta and 
Massague, 2006; Talmadge and Fidler, 2010), we noted that 
highly metastatic LvM3b cells were significantly better at persist- 
ing in the liver microenvironment than their poorly metastatic 
parental line, consistent with a key role for intrahepatic persis-
tence in metastatic progression (Figure S1H). We examined
whether the enhanced capacity of metastatic cells to persist in
the hepatic microenvironment is regulated by miR-483-5p or
miR-551a. Indeed, overexpression of miR-483-5p and miR-
551a in LS-LvM3b cells suppressed (Figures S1I and S1J), while



Figure 1. miR-483-5p and miR-551a Are Endogenous miRNAs that Suppress Liver Metastasis 

(A) Bioluminescence plot of liver colonization by 5 x 10^ LS-Parental, LvM3a, and LvM3b cells after direct intrahepatic injection (n > 5). Mice were imaged at day
21 after injection and livers extracted for ex vivo imaging and gross morphological examination. Photon flux ratio is the ratio of bioluminescence signal at day 21
normalized to signal on day 0.

(B) Schematic for the identification of miR-483-5p and miR-551a as suppressors of metastasis.

(C) Liver metastasis of mice injected with 5x10^ LvM3b cells overexpressing either a control hairpin, miR-483-5p, or miR-551a (n > 5).

(D) Liver metastasis in mice injected with 5x10^ SW480 cells, whose endogenous miR-483-5p or miR-551a was inhibited (n > 5).

(E and F) Organotypic slice culture imaging of SW480 cells (n = 8) whose endogenous miR-483-5p (E) or miR-551a (F) were inhibited by LNAs. Cells (5x10^) were
labeled with cell-tracker green (control LNA) or cell-tracker red (miRNA-specific LNA) and introduced into the livers prior to slice culture. Dye-swap experiments
were performed to compensate for dye bias. Representative images at day 0 and day 3 are shown. Total area of each cell population at indicated time points was
measured and normalized to start of experiment. Scale bar represents 50 µm.

(G) Bioluminescent metastatic signal from mice (n = 5) injected with 5x10® SW480 cells whose endogenous miR-483-5p or miR-551a activities were inhibited.
Images and measurements were taken 24 hr after tumor cell inoculation.

(H) Relative in vivo caspase activity of SW480 cells whose endogenous miR-483-5p or miR-551a was inhibited (n = 3). Caspase activity was monitored using
a caspase-3/7-activated DEVD-luciferin and normalized with bioluminescent signal from regular luciferin. Error bars represent SEM; all p values are based on
one-sided Student's t tests, or where appropriate, Mann-Whitney test for non-Gaussian distribution. *p < 0.05; **p < 0.01; ***p < 0.001.

See also Figure S1 and Tables S1 and S2.
inhibition of these miRNAs enhanced (Figures 1E and 1F) colon
cancer persistence in the hepatic microenvironment. In agree-
ment with our organotypic slice culture findings, we noted that
as early as 24 hr after injection of cells into the portal circulation,
cells whose endogenous miRNAs were inhibited out-competed
control cells (Figure 1G). As neither of these miRNAs significantly
affected proliferation, we asked if they elicited their effects by
suppressing cancer cell survival during metastatic progression.
To quantify cell death in vivo, we utilized a bioluminescence-
based luciferin reporter of caspase-3/7 activity (Hickson et al.,
2010). MiRNA inhibition significantly reduced in vivo caspase
activity in colon cancer cells during the early phase of hepatic
colonization (Figure 1H), revealing cancer survival to be the
phenotype suppressed by these miRNAs. These in vivo findings
provide corroboration and a mechanistic basis for the organo-
typic slice culture observations. Our findings reveal miR-483
and miR-551a to suppress liver metastatic colonization through
suppression of metastatic cell survival in the liver, a phenotype
exhibited by highly metastatic colon cancer cells.

miR-483-5p and miR-551a Suppress Colorectal Cancer 
Cell Survival and Metastasis in the Liver through 
Targeting of CKB 

We next sought to identify the downstream effectors of these 
miRNAs. Through transcriptomic profiling, we identified tran-
scripts that were downregulated by overexpression of each
miRNA and that contained 3'-UTR or coding-sequence elements 
complementary to the miRNAs. Interestingly, CKB was identified 
as a putative target of both miRNAs, suggesting that these miR- 
NAs, which exhibit common in vivo and organotypic phenotypes, 
might mediate their effects through a common target gene (Table 
S3). Importantly, endogenous miR-483 and miR-551a were 
found to suppress CKB protein levels (Figure 2A). Quantitative 
RT-PCR validation also revealed suppression or upregulation 
of CKB transcript levels upon overexpression or inhibition of 
the miRNAs, respectively (Figures S2A and S2B). Mutagenesis 
and luciferase-based reporter assays revealed miR-483 to 
directly target the 3'UTR and miR-551a to directly target the cod-
ing region of CKB (Figures S2C and S2D). Overexpression of
CKB in poorly metastatic SW480 cells was sufficient to signifi- 
cantly enhance liver metastasis (>3-fold; Figure 2B), while CKB 
knockdown in metastatic LS-LvM3b cells and SW480 cells, 
through the use of two independent hairpins for each line, 
robustly suppressed liver metastatic colonization (>5-fold; Fig- 
ures 2C and S2E). Importantly, metastases that grew out in 
knockdown experiments had “escaped” small hairpin RNA 
(shRNA) knockdown and displayed restored CKB expression 
(Figure S2F). Consistent with the effects of its regulatory miR- 
NAs, CKB overexpression was sufficient to significantly enhance 
the ability of colon cancer cells to persist in the liver micro- 
environment and enhanced their representation in the liver 
(Figure 2D), while CKB knockdown substantially reduced intra- 
hepatic persistence (Figure 2E). Consistent with this, CKB over- 
expression reduced (Figure 2F), while CKB knockdown signifi- 
cantly enhanced (Figure 2G), in vivo caspase-3/7 activity in colon 
cancer cells during the initial phase of hepatic colonization. To 
investigate whether CKB acts directly downstream of miR-483-
5p and miR-551a, we performed gain-of-function, loss-of-func-
tion, and epistasis studies. Knockdown of CKB in cells inhibited
for miR-483-5p or miR-551a prevented the enhanced metastasis
effect seen with miR-483-5p or miR-551a silencing (Figure 2H).
Conversely, overexpression of CKB was sufficient to rescue the
suppressed liver metastatic phenotypes of cells overexpressing
miR-483-5p or miR-551a (Figure S2G). The results of the above
mutational, gain-of-function, loss-of-function, and epistasis ana-
lyses reveal CKB to be a direct target of miR-483-5p and miR-
551a, to act as a downstream effector of these miRNAs in the
regulation of colon cancer metastatic progression, and to be a
promoter of colon cancer survival during hepatic metastatic 
colonization. 

CKB Promotes Colorectal Cancer Cell Survival during 
Acute Intrahepatic Hypoxia through Modulation of the 
Phosphocreatine/ATP Shuttle 

CKB is known to regulate the reservoir of rapidly mobilized high- 
energy phosphates in tissues such as the brain by catalyzing 
the transfer of a high-energy phosphate group from ATP to the 
metabolite creatine, yielding phosphocreatine (Wyss and 
Kaddurah-Daouk, 2000). Recent studies have implicated the 
involvement of metabolic pathways in tumorigenesis and cancer 
progression (Cairns et al., 2011; Christofk et al., 2008; Dang 
et al., 2009; Kaelin and McKnight, 2013). The maintenance of 
intracellular ATP levels is also critical for cancer cell survival un-
der metabolic stress (Jeon et al., 2012). Highly metabolic cells
maintain phosphocreatine stores in order to buffer against low 
ATP states, because phosphocreatine’s high-energy phosphate 
can be transferred to ADP to generate ATP (Wallimann et al., 
1992; Wyss and Kaddurah-Daouk, 2000). Consistent with this, 
overexpression and knockdown of CKB in colon cancer cells 
increased and decreased, respectively, intracellular phospho- 
creatine levels (Figure 3A) and CKB depletion resulted in 
decreased ATP levels that could be rescued by phosphocreatine 
supplementation (Figure 3B). Consistent with our findings that 
miR-483 and miR-551a regulate CKB expression, modulation
of either of these miRNAs also modulated intracellular phospho- 
creatine (Figures S3A and S3B) and ATP levels (Figures S3C and 
S3D). What purpose could CKB-generated phosphocreatine and 
ATP play during colon cancer metastatic progression? The liver 
microenvironment is known to contain hypoxic regions, with 
metabolically active hepatocytes at the periportal region display- 
ing high rates of oxygen consumption and hepatocytes at the 
perivenous region actively undergoing glycolysis (Arteel et al., 
1995; Jungermann and Kietzmann, 2000). Additionally, colon 
cancer cells metastasize to the liver via the portal circulation, 
which is relatively hypoxemic. We hypothesized that colorectal 
cancer cells experience acute hypoxia and intense competition 
for glycolytic substrates during initial dissemination to the liver 
and could be poorly adapted to the liver microenvironment prior 
to HIF-activated responses (Semenza, 2011). ATP generated 
from rapid utilization of intracellular phosphocreatine reservoirs 
might enable colon cancer cells to survive acute hepatic hypox- 
ia. To determine if colon cancer cells experience hypoxia during 
early metastatic colonization, we utilized a Hif-1 alpha transcrip-
tional luciferase-reporter (HRE-Luc) as an in vivo sensor and
reporter of cellular hypoxia (Figure S3E) and observed that 
colon cancer cells experience hypoxia early after hepatic 
Figure 2. miR-483-5p and miR-551a Suppress Colorectal Cancer Cell Survival and Metastasis in the Liver through Regulation of CKB

(A) Expression of CKB in SW480 cells whose endogenous miR-483-5p or miR-551a was inhibited with LNAs. 

(B) Liver metastasis in mice injected intrasplenically with 5x10® control SW480 cells or CKB overexpressing SW480 cells (n = 5). Mice were euthanized at 28 days 
after injection. 

(C) Liver metastasis in mice injected intrasplenically with 5x10® LvM3b expressing a control hairpin or two independent shRNA hairpins targeting CKB (n = 6). 
Mice were euthanized 21 days after injection. 

(D) Survival of control SW480 and CKB overexpressing SW480 cells in organotypic liver slices (n = 8). 

(E) Survival of LvM3b cells expressing a control hairpin or hairpin targeting CKB in organotypic slice cultures (n = 8). Representative images at day 0 and day 2 are 
shown. Scale bar represents 50 µm.

(F) Relative in vivo caspase activity of control or CKB overexpressing SW480 cells. Caspase activity was measured on days 1, 4, and 7 after injection (n = 3).

(G) Relative in vivo caspase-3 activity of SW480 cells expressing a control shRNA or shRNA targeting CKB. Caspase activity was measured on days 1, 4, and 7
after injection (n = 3). 

(H) Liver metastasis in mice injected with 5x10® SW480 cells whose endogenous miR-483-5p or miR-551a was inhibited by LNA, with and without CKB 
knockdown. Error bars represent SEM; all p values are based on one-sided Student’s t tests, or where appropriate, Mann-Whitney test for non-Gaussian dis- 
tribution. *p < 0.05; **p < 0.01; ***p < 0.001. 

See also Figure S2 and Table S4. 
Figure 3. CKB Modulates Colorectal Cancer Cell Survival during
Acute Intrahepatic Hypoxia through Modulation of the Phospho- 
creatine/ATP Shuttle 

(A) Relative intracellular phosphocreatine levels in SW480 cells overexpressing 
CKB or depleted for CKB (n = 5). 

(B) Relative intracellular ATP levels in LvM3b cells depleted for CKB, with and 
without exogenous 10 urn phosphocreatine supplementation (n = 5). 

(C) In vivo caspase activity of control and CKB knockdown SW480 cells 
experiencing hypoxia within the livers of mice (n = 3). 

(D) Survival of colorectal cancer cells in hypoxia in vitro with and without CKB 
knockdown and 10 rim phosphocreatine supplementation (n = 3). 

(E) Liver metastasis by CKB depleted LvM3b cells with and without overnight
preincubation with 10 riM phosphocreatine. Cells (5 x 10^) were then inocu-
lated into the liver of mice through intrasplenic injection.

(F) Liver metastasis in mice injected with 5x10^ LvM3b cells with and without 
pretreatment with 10 mM cyclocreatine for 48 hr. Error bars represent SEM; all
p values are based on one-sided Student's t tests, or where appropriate,
Mann-Whitney test for non-Gaussian distribution. *p < 0.05; **p < 0.01;
***p < 0.001.

See also Figure S3. 



dissemination (Figure S3F). We found that CKB depletion
increased caspase-mediated cell death in HRE-Luc expressing
cells experiencing hypoxia in vivo (Figure 3C). Conversely, inhi-
bition of either miR-483-5p or miR-551a protected HRE-Luc
expressing cells experiencing hypoxia in vivo (Figure S3G).
Consistent with a role for CKB and phosphocreatine in promot- 
ing cancer-cell survival during hypoxia, cells depleted of CKB 
through RNAi displayed reduced survival while experiencing 
hypoxia in vitro— an effect that was abrogated upon phospho- 
creatine supplementation (Figure 3D). In agreement with our 
in vitro findings, preincubation of colon cancer cells depleted 
of CKB with phosphocreatine enhanced their ability to metasta- 
size to the liver (Figure 3E). Conversely, liver metastasis was 
inhibited when we preincubated colon cancer cells with cyclo- 
creatine (Lillie et al., 1993), an inhibitor of CKB that depletes 
phosphocreatine levels (Figure 3F). Our findings suggest that he- 
patic hypoxia poses a barrier for colon cancer cells during early 
metastatic colonization and that cells endure this phase through 
the generation of ATP from phosphocreatine reserves. Indeed, 
the ability of phosphocreatine preloading to enhance metastasis 
in vivo supports the importance of the acute initial hypoxic barrier 
and energetic demands in shaping metastatic colonization by 
cancer cells as they enter the liver microenvironment. 
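
For reference, the reversible creatine kinase reaction underlying the phosphocreatine/ATP shuttle discussed in this section can be written in its standard textbook form. The abbreviations Cr (creatine) and PCr (phosphocreatine) are introduced here only for brevity. The forward direction corresponds to CKB phosphorylating creatine at the expense of ATP; the reverse direction corresponds to the transfer of the high-energy phosphate of phosphocreatine to ADP to regenerate ATP.

```latex
\[
\mathrm{Cr} + \mathrm{ATP}
\;\underset{\mathrm{CKB}}{\rightleftharpoons}\;
\mathrm{PCr} + \mathrm{ADP} + \mathrm{H}^{+}
\]
```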

CKB Is Secreted by Colorectal Cancer Cells and 
Promotes Malignant Conversion of Extracellular ATP 
and Creatine to Phosphocreatine 

While considering CKB’s role in the setting of the hypoxic hepat- 
ic microenvironment that colorectal cancer cells arrive into as 
they disseminate from the gut to the liver, we faced a conun- 
drum: how can colon cancer cells arriving and residing in a hyp- 
oxic hepatic microenvironment replenish phosphocreatine if 
they are depleted of ATP, especially during the acute phase, 
prior to any hypoxia-response (Bertout et al., 2008; Semenza, 
2013; Wheaton and Chandel, 2011)? Earlier clinical studies 
have described the detection of CKB proteins and CKB activity 
in the sera of patients with various forms of malignancies and 
physiological insults (Huddleston et al., 2005; Rubery et al.,
1982; Wyss and Kaddurah-Daouk, 2000). The presence of 
extracellular ATP in the microenvironment of macrometastases
was also reported (Pellegatti et al., 2008; Stagg and Smyth, 
2010). Interestingly, the liver is the main synthetic organ for cre- 
atine synthesis in the body. We hypothesized that colorectal 
cancer cells may release CKB into the extracellular space, which 
can then convert extracellular ATP and liver-produced creatine 
into phosphocreatine that is then taken up by cancer cells, 
thereby exerting a protective effect on hypoxic colorectal cancer 
cells prior to their adaptation to the hypoxic liver microenviron- 
ment. We thus investigated the possibility of metastatic colo- 
rectal cancer cells releasing CKB extracellularly. Indeed, extra- 
cellular CKB was released from metastatic LvM3b cells, but 
not LvM3b cells expressing a CKB targeting shRNA (Figure 4A). 
To determine if extracellular CKB was released from live or dying 
cells, we expressed CKB tagged with a FLAG-epitope through
a linker containing a caspase-3/7 recognition DEVD motif. 
Caspase activation in apoptotic cells would result in caspase 
recognition and cleavage of the DEVD motif between the 
FLAG-epitope tag and CKB, causing loss of the FLAG-epitope 
from the expressed CKB (Figure 4B). We observed that extra- 
cellular CKB was released primarily by live cells, as the FLAG- 
epitope was not lost (Figure 4C). As generation of extracellular 
phosphocreatine requires exogenous ATP and creatine, we 
confirmed the presence of extracellular ATP in the microenvi- 
ronment of incipient micrometastases using a plasma mem- 
brane-anchored luciferase reporter for detecting extracellular 
ATP (pME-Luciferase; Figure 4D) (Pellegatti et al., 2005). If the 
prometastatic effects of CKB result from utilization of extracel- 
lular ATP as a substrate, then depleting extracellular ATP should 
suppress the prometastatic activity of CKB. Indeed, expressing 
CD39, a plasma membrane anchored ATP hydrolase, in SW480
cells significantly precluded the ability of CKB overexpression 
to promote metastasis without affecting CKB levels (Figure 4E). 
Consistent with CKB consumption of extracellular ATP, cells 
overexpressing CKB or cells whose endogenous miR-483-5p
or miR-551a were inhibited displayed significantly lower extra-
cellular ATP levels in vivo relative to the control cells (Figures
4F and S4A). Conversely, the microenvironment surrounding 
CKB knockdown cells displayed higher extracellular ATP levels 
(Figure S4B). If extracellular CKB catalysis can enhance metas- 
tasis, we reasoned that supplying the product of CKB-mediated 
catalysis, phosphocreatine, in the extracellular space should 
rescue the effect of CKB loss-of-function. In order to test this, 
we implanted a mini-osmotic pump that continuously released 
phosphocreatine into the peritoneal cavity of immunodeficient 
mice, which is eventually drained into the portal circulation. 
Remarkably, exogenous phosphocreatine was sufficient to 
significantly enhance metastasis (>10-fold) by CKB-depleted
cells in vivo (Figure 4G). We next used a Boyden chamber cocul- 
ture system to determine if colorectal cancer cells overexpress- 
ing CKB could promote the survival of CKB knockdown cells un- 
der hypoxia (Figure 4H). CKB overexpressing cells were indeed 
able to compensate for the survival of CKB knockdown cells 
across the trans-well, while addition of a CKB-activity neutral-
izing antibody abrogated this effect (Figure 4I). We extended
our findings to an in vivo system using colorectal cancer cells 
depleted of intracellular CKB but expressing a secreted form of 
CKB fused to the IgK secretory signal sequence. Remarkably,
overexpressing secreted CKB was sufficient to enhance colo-
rectal cancer metastasis (Figure 4J). We further examined the
levels of serum CKB in mice injected with CKB knockdown cells. 
Interestingly, we found that mice with escaped tumors invariably 
had increased serum CKB levels (Figure S4C). 

The Creatine Transporter SLC6a8 Modulates 
CKB-Mediated Colorectal Cancer Metastasis 

Having implicated exogenous creatine/phosphocreatine meta-
bolism in colorectal cancer metastasis, we next investigated
the regulation of creatine/phosphocreatine metabolism in colon 
cancer metastatic progression. Depletion of guanidinoacetate 
methyltransferase (GAMT), the enzyme required for the final 
step of creatine synthesis, in colon cancer cells did not affect 
metastasis (Figure S5A), consistent with a model wherein extra- 
cellular (the liver is the primary site of creatine biosynthesis) 
rather than intracellular creatine drives colorectal cancer metas- 
tasis. Next, we asked if SLC6a8, a transporter of creatine com- 
pounds (Salomons et al., 2001) modulates phosphocreatine 
levels in colon cancer cells. Indeed, SLC6a8 knockdown re- 
duced intracellular phosphocreatine and ATP levels (Figure 5A). 
If extracellular phosphocreatine uptake promotes metastasis, 
such impairment in phosphocreatine levels should reduce 
metastasis. Indeed, multiple colon cancer cell lines depleted of 
SLC6a8 displayed substantially reduced (10- to 100-fold) meta- 
static activity (Figures 5B and S5B), whereas metastatic tumors 
that eventually grew out from SLC6a8 knockdown cells were es- 
capers and displayed restored SLC6a8 expression (Figure S5C). 
Importantly, SLC6a8 knockdown, which depleted cellular uptake
of extracellular phosphocreatine, also abrogated the effect 
of CKB overexpression on colorectal cancer metastasis (Fig- 
ure 5C), revealing extracellular phosphocreatine uptake to be 
downstream of CKB catalysis. Additionally, depleting SLC6a8 
in CKB knockdown cells abrogated the protective effect of phos-
phocreatine during hypoxic stress (Figure 5D). Finally, exoge-
nously added phosphocreatine was not able to promote liver
metastasis by SLC6a8 knockdown cells (Figure S5D). These 
findings reveal SLC6a8 to be downstream of CKB and phospho- 
creatine and to mediate their metastasis promoting effects. 

miR-483-5p, miR-551a, CKB, and SLC6a8 Associate 
with Human Progression Stage 

We next analyzed the expression levels of miR-483 and miR-
551a in a set of 66 surgically resected human primary colon
cancer and liver metastases from MSKCC. Consistent with a
metastasis-suppressive role for these miRNAs during cancer
progression, miR-483 and miR-551a both independently dis-
played significantly reduced expression levels in human liver me-
tastases relative to primary colon cancers (Figures 6A and 6B;
p = 0.02 for miR-483 and p = 0.001 for miR-551a; N = 66).
CKB (Figure 6C; p = 0.05, N = 233) and SLC6a8 (Figure 6D; 
p < 0.001, N = 233) expression levels were significantly higher 
in liver metastases relative to primary tumors in an independent 
publicly available gene expression data set, suggesting a selec- 
tion for enhanced expression of CKB and SLC6a8 during 
progression. We also constructed a tissue microarray from a 
collection of primary colorectal tumors and liver metastases 
that were surgically resected from patients at Weill-Cornell 

Figure 4. CKB Is Secreted by Colorectal Can- 
cer Cells and Promotes Malignant Conversion 
of Extracellular ATP and Liver Creatine to 
Phosphocreatine to Enhance Metastasis 

(A) Extracellular and intracellular CKB protein levels in 
control and LvM3b cells depleted of CKB through 
RNAi. 

(B) FLAG-tagged CKB with a caspase 3/7 recognition 
site linker. The FLAG-DEVD-CKB has a FLAG-tag 
linked to the N terminus of CKB by a linker containing 
a caspase 3/7 recognition motif (DEVD-amino se- 
quence). Caspase activation in apoptotic cells will 
result in cleavage of linker and release of FLAG-tag. 

(C) Western blot of FLAG-DEVD-CKB overex- 
pressing cells demonstrate release of CKB by non- 
apoptotic cells into the extracellular space. 

(D) Bioluminescent imaging of immunodeficient mice 
injected with SW480 cells expressing pME-Luc for 
detection of extracellular ATP (n = 5). 

(E) Liver metastasis by 5 x 10® SW480 cells over- 
expressing CKB with concomitant overexpression of 
CD39. 

(F) Relative extracellular ATP levels in CKB over- 
expressing cells. Control and CKB overexpressing 
pME-Luc SW480 cells were injected into mice (n = 5). 

(G) Liver metastasis by CKB-depleted LvM3b cells in 
mice implanted with an osmotic pump releasing 
phosphocreatine into the portal circulation. 

(H) Scheme for coculture experiment. CKB-knock- 
down (5 X 10'^) cells were cultured on the bottom of 
24-well plates, while control or CKB-overexpressing 
cells were plated onto Boyden chambers above 
CKB-knockdown cells with pores for exchange of 
metabolites and proteins. Cells at the bottom of the 
well were counted after 4 days in hypoxia. 

(I) Relative survival of CKB-knockdown cells in 1 % 
oxygen when cocultured with control, CKB-over- 
expressing cells or with CKB-overexpressing cells in 
the presence of a neutralizing antibody (n = 4). 

(J) Liver metastasis by endogenous CKB-knockdown 
SW480 cells overexpressing a secreted form of CKB. 
Error bars represent SEM; all p values are based on 
one-sided Student’s t tests, or where appropriate, 
Mann-Whitney test for non-Gaussian distribution. 
*p < 0.05; **p < 0.01; ***p < 0.001. 

See also Figure S4. 

Figure 5. SLC6a8 Modulates CKB-Mediated Colon Cancer Metastasis through Modulation of Intracellular Phosphocreatine Levels 

(A) Relative intracellular phosphocreatine and ATP levels in LvM3b cells expressing a control shRNA or shRNA targeting SLC6a8 (n = 4). 

(B) Liver metastasis by 5 x 10^ LvM3b cells expressing two independent short hairpins targeting SLC6a8 (n = 5). 

(C) Liver metastasis by SW480 cells overexpressing CKB with and without SLC6a8 depletion (n > 4). 

(D) In vitro survival of LvM3b cells depleted of CKB, SLC6a8 with and without phosphocreatine supplementation in hypoxia (n = 3). Error bars represent SEM; all p 
values are based on one-sided Student's t tests, or where appropriate, Mann-Whitney test for non-Gaussian distribution. *p < 0.05; **p < 0.001; ***p < 0.0001. 
See also Figure S5 and Table S4. 



Medical Center and immunohistochemically stained for CKB and 
SLC6a8 expression. CKB (Figure 6E, p < 0.05, N = 92) and 
SLC6a8 (Figure 6F, p < 0.001, N = 88) protein expression levels 
were found to be elevated in liver metastases relative to primary 
tumors of patients. These findings are consistent with, and sug- 
gest the pathophysiological basis for, previous studies revealing 
elevated expression levels of CKB in advanced stage cancer 
(Wallimann and Flemmer, 1994) and reveal significant associa- 
tion between the components of this multi-miRNA network and 
colon cancer progression. 

miR-483-5p, miR-551a, and CKB Modulation Provides 
Clinical Benefit 

We sought to investigate the therapeutic potential of targeting 
this clinically relevant miRNA regulatory network. We first tested 
the ability of adeno-associated virus to transduce colon cancer 
cells in vitro and detected viral genomic DNA (gDNA) in colon 
cancer cells even with low multiplicity of infection (Figure S6A). 
Injection of mice bearing macroscopic hepatic metastases with 



adeno-associated virus revealed that adeno-associated virus 
was able to infect colon cancer metastases in vivo (Figure S6B). 
We next injected mice with 5x10^ highly metastatic LvM3b cells 
and 24 hr later administered a single intravenous dose of adeno- 
associated virus (AAV) encoding miR-483-5p and miR-551a 
from a single transcript. Surprisingly, a single therapeutic dose of 
AAV delivering both miRNAs dramatically and significantly 
reduced metastatic colonization (>5-fold; Figure 6G). Therapeu- 
tic efficacy was also seen in mice injected with SW480 cells (Fig- 
ure 6H). Importantly, this treatment did not cause any adverse 
phenotypic outcomes or pathological abnormalities in mice. 
Moreover, autopsied mice treated with this AAV therapy did 
not harbor spontaneous tumors. Even mice injected with 
BEAS-2B immortalized lung epithelial cells that are prone to 
oncogenic transformation did not develop tumors (Figure S6C). 
We next investigated the therapeutic potential of targeting this 
metabolic network by determining the impact of small-molecule 
inhibition of CKB on colon cancer metastasis using cyclocrea- 
tine. Therapeutic treatment of mice with cyclocreatine after 

Figure 6. miR-483-5p, miR-551a, CKB, and SLC6a8 Are Clinically Relevant in Independent Cohorts of Patients and Can Be Therapeutically 
Targeted 

(A and B) miR-483-5p and miR-551a levels in 36 primary colorectal adenocarcinomas and 30 liver metastases were quantified by quantitative real-time PCR. 
(C and D) CKB and SLC6a8 expression from a public microarray data set (GSE41258) comparing primary tumors and liver metastases (N = 233). 



Figure 7. Model for the miR-483-5p, miR-551a, CKB, and SLC6a8 
Axis 

Disseminated colon cancer cells arrive in the liver microenvironment through 
the hypoxemic portal circulation. Within the liver microenvironment, they 
experience hypoxic stress and ATP depletion. Cells that upregulate CKB 
through loss of miRNAs release CKB into the extracellular matrix where it 
converts available creatine and ATP into phosphocreatine that is then taken up 
by the cell to fuel metastatic survival and subsequent organ colonization. 
Colon cancer cells with higher levels of CKB also build up a larger pool of 
intracellular phosphocreatine that acts as a buffer against energetic stress. 



colorectal cancer cell inoculation also significantly reduced met- 
astatic colonization, demonstrating proof-of-principle for target- 
ing this kinase as a means of metastasis suppression (Figure 6I). 
Given that the liver is a common site of metastasis for other 
gastrointestinal cancers such as pancreatic cancer, we sought 
to determine if knockdown of components in this pathway could 
also inhibit metastasis by pancreatic cancer cells. Knockdown of 
CKB and SLC6a8 in PANC1 cells (a KRAS mutant human 
pancreatic line) with multiple shRNAs, strongly suppressed their 
ability to metastasize (>10-fold; Figures 6J and S6G). This finding 
suggests that CKB and SLC6a8, and their associated metabolic 
pathway, may more broadly govern liver metastasis by other 
gastrointestinal cancers. 

DISCUSSION 

Colorectal cancer is diagnosed in over a million patients a year 
globally, with the majority of advanced stage patients (over 
600,000) experiencing liver metastatic progression (Jemal 
et al., 2011). Using a systematic approach, we have identified 
two miRNAs that act as suppressors of liver metastatic coloniza- 
tion by colon cancer cells. MiR-483-5p had been recently re- 
ported to be oncogenic in lung adenocarcinoma by enhancing 
invasion and progression and tumor growth in IGF2-dependent 
sarcoma (Song et al., 2014), while miR-551a has been recently 



implicated in suppressing gastric cancer (Li et al., 2012b). Given 
that miRNAs are widely known to act in a context-specific 
manner, we experimentally validated the role of these miR- 
NAs in suppressing liver metastasis by colon cancer cells, and we 
find that these miRNAs suppress a metabolic axis that drives 
liver colonization by convergent targeting of CKB, a key gene 
that allows colon cancer cells to expand their phosphocreatine 
reserves during periods of positive cellular energy balance 
(Wyss and Kaddurah-Daouk, 2000). Moreover, enhanced phos- 
phocreatine reserves, which can fuel ATP generation, provide 
great utility during periods of intense energetic requirement 
such as during metastatic colonization of the liver microenviron- 
ment— a hypoxic microenvironment that also contains perive- 
nous hepatocytes that compete with cancer cells for glycolytic 
substrates (Jungermann and Kietzmann, 1996). We find that co- 
lon cancer cells, which enter the liver via the hypoxemic portal 
circulation, undergo substantial cell death. Cells capable of 
generating sufficient phosphocreatine reserves are better able 
to survive the initial phase of hypoxic hepatic colonization. Colon 
cancer cells that survive the initial selective pressure of hypoxic 
hepatic dissemination can then activate pathways involved in 
energy homeostasis and generation (DeBerardinis et al., 2008; 
Hardie et al., 2012; Inoki et al., 2012; Jeon et al., 2012; Kaelin 
and McKnight, 2013; Semenza, 2011) and harness additional 
prometastatic programs for successful competition and further 
colonization of the liver (Chiang and Massague, 2008). 

As incipient metastatic cells are dependent on CKB-mediated 
intracellular phosphocreatine generation, which arises from miR- 
483-5p/miR-551a silencing and CKB overexpression, a selec- 
tion for subpopulations that exhibit silencing of these miRNAs 
and consequent induction of CKB would occur. We demonstrate 
that cells overexpressing the metabolic kinase CKB are able 
to survive and progress within the hepatic parenchyma. How- 
ever, in the context of energetic stress and ATP limitation, a 
paradoxical situation would arise wherein intracellular phospho- 
creatine generation from ATP would comprise a futile cycle. 
Interestingly, we find that colon cancer cells secrete CKB into 
the extracellular space, where it catalyzes the ATP-dependent 
phosphorylation of creatine— yielding phosphocreatine (Fig- 
ure 7). Extracellular phosphocreatine had been shown to be pro- 
tective against hypoxic, ischemic, and other energetic insults in 
neurons and myocardium, with increased phosphocreatine up- 
take observed in ischemic myocardium (Brustovetsky et al., 
2001; Li et al., 2012a; Sharov et al., 1987). Our findings suggest 
that colon cancer cells can actually generate extracellular phos- 
phocreatine and import it as a means of enhancing energy 



(E) CKB expression in primary tumors compared to liver metastases examined through immunohistochemical staining of a tissue microarray (N = 92). Scale bar 
represents 50 μm. 

(F) SLC6a8 expression in primary tumors compared to liver metastases examined through immunohistochemical staining of above mentioned tissue microarray 
(N = 88). Scale bar represents 50 μm. 

(G) Liver metastasis in mice injected with 5x10® LvM3b cells and treated with a single dose of AAV doubly expressing miR-483-5p and miR-551a 1 day after 
cell injection (n = 6). 

(H) Liver metastasis in mice injected with 5x10® SW480 cells and treated with a single dose of AAV doubly expressing miR-483-5p and miR-551a 1 day after 
cell injection (n = 4). 

(I) Liver metastasis in mice injected with 5x10® LvM3b cells and treated with cyclocreatine daily for 2 weeks (n > 15). 

(J) Liver metastasis by pancreatic cancer cells, PANC1, with knockdown of CKB with two independent shRNA hairpins (n = 5). Error bars represent SEM; all p 
values are based on one-sided Student's t tests, or where appropriate, Mann-Whitney test for non-Gaussian distribution. *p < 0.05; **p < 0.01; ***p < 0.001. 
See also Tables S4 and S5. 

stores. Interestingly, we find that colon cancer cells have devel- 
oped a remarkable adaptive mechanism, secretion of CKB, 
that enables them to catalytically enhance extracellular phos- 
phocreatine levels from exogenous precursors. The “harvest- 
ing” of extracellular metabolites through secretion of CKB by 
colon cancer cells thus represents a powerful mechanism for 
survival when malignant cells are highly vulnerable. 

While our findings reveal an important prometastatic role for 
extracellular CKB, we do not rule out the possibility that intracel- 
lular CKB could also play a role in cancer progression. One po- 
tential intracellular role for CKB might occur during contexts 
when cancer cells have adequate levels of ATP, such as during 
primary tumor growth in the colonic epithelium, or subsequent 
to metastatic tumor expansion and recruitment of functional 
blood vessels, which could provide adequate oxygenation and 
glucose for fueling ATP generation. In such contexts, intracel- 
lular CKB could allow cancer cells to expand their intracellular 
buffer of phosphocreatine to be drawn upon during subsequent 
periods of reduced nutrients. While we have identified one 
pathway that subserves cancer progression, it is possible that 
malignant cells conduct other metabolic reactions in the extra- 
cellular space that could allow for the extraction of extracellular 
energy and substrates, or the degradation of toxic metabolic 
wastes. 

Our findings from human primary and metastatic tissue spec- 
imens support our model (Figure 7), revealing enhanced 
expression of CKB and reduced expression of its repressive 
miRNAs in liver metastases— consistent with a selection for 
cells with molecular activation of this pathway in the hepatic pa- 
renchyma. While our findings have revealed a key pathway that 
governs colon cancer metastatic colonization of the liver, the 
predominant site of metastasis by colorectal cancer, and sug- 
gest that the in vivo-selected metastatic sublines we have 
derived display the same organotropism for the liver that the 
majority of human colorectal cancers display, we do not 
know if this pathway also regulates the colonization of other or- 
gans such as the lungs by these cells. We speculate that part of 
this organotropism may arise from the production of creatine 
by the liver— the primary organ for creatine biosynthesis. Our 
findings that CKB and SLC6a8 depletion suppress pancreatic 
cancer metastasis highlight the possibility that additional 
gastrointestinal cancer types, which exhibit tropism for the liver, 
may also utilize this pathway and as such, may be also vulner- 
able to CKB inhibition therapy. 

While the promise of miRNA therapeutics has been great, its 
actual clinical implementation has been more limited given inad- 
equate stability and delivery of small RNA therapeutics to target 
tissues. The liver, however, is an exceptional organ in this 
respect, because miRNAs and RNAi molecules accumulate to 
higher degrees in the hepatic parenchyma relative to other or- 
gans and we have demonstrated adeno-associated virus to be 
efficient in infecting colon cancer metastases. Additionally, ad- 
eno-associated approaches of gene delivery have demonstrated 
proof-of-concept in human trials (Nathwani et al., 2011). Given 
these features of the liver, our identification of metastasis sup- 
pressor miRNAs in colorectal cancer and our proof-of-principle 
demonstration of their therapeutic activity have important clinical 
implications because both nanoparticle-mediated and adeno- 



associated viral delivery of these metastasis suppressors could 
be viable paths clinically. A more conventional path toward tar- 
geting this pathway could be the development of highly potent 
inhibitors of CKB that would act in a similar manner as cyclocrea- 
tine, given that highly specific kinase inhibitors can be designed 
(Dar and Shokat, 2011; Davis and Schlessinger, 2012). Impor- 
tantly, such inhibitors do not have to be cell-permeable, because 
they would be targeting extracellular CKB. This could substan- 
tially increase the therapeutic index of such compounds. The 
poor overall efficacy of the current standard-of-care chemo- 
therapy regimen FOLFOX in reducing metastatic relapse rates 
in high-risk patients necessitates the development and testing 
of such targeted therapeutic approaches in this prevalent 
disease. 

EXPERIMENTAL PROCEDURES 
Animal Studies 

All animal work was conducted in accordance with a protocol approved by 
the Institutional Animal Care and Use Committee (IACUC) at The Rockefeller 
University. Age-matched male NOD-SCID mice (5- to 6-week-old) were 
used for organotypic slice culture, intrahepatic colonization, and liver metas- 
tasis assays involving LS174T, SW620, WIDR, LvM3a, and LvM3b cell lines. 
Age-matched male NOD/SCID gamma mice (5- to 6-week-old) were 
used for liver metastasis assays for the SW480 and PANC1 cell lines. 

Lenti-miR Library Screening 

Cells were transduced with a lentiviral Lenti-miR library (System Biosciences) 
at a low multiplicity of infection (MOI < 1) such that each cell overexpressed a 
single miRNA. The transduced population was then injected intrahepatically 
into NOD-SCID mice for in vivo selection of miRNAs that when overexpressed, 
either promoted or suppressed metastatic liver colonization. Genomic DNA 
PCR amplification and recovery of lenti-viral miRNA inserts was performed 
on cells prior to injection and from liver nodules according to the manufac- 
turer's protocol. miRNA array profiling allowed for miRNA insert quantification 
prior to and subsequent to in vivo selection. 
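
The rationale for screening at MOI < 1 can be sketched numerically. The snippet below is only an illustration under an assumption not stated in the text: that the number of viral integrants per cell follows a Poisson distribution with mean equal to the MOI. The MOI values used are hypothetical; the source specifies only MOI < 1.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell carries exactly k integrants (Poisson assumption)."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def single_insert_fraction(moi: float) -> float:
    """Among transduced cells (k >= 1), the fraction carrying exactly one insert."""
    p_zero = poisson_pmf(0, moi)
    p_one = poisson_pmf(1, moi)
    return p_one / (1.0 - p_zero)


if __name__ == "__main__":
    # Hypothetical MOI values chosen for illustration only.
    for moi in (0.1, 0.3, 1.0):
        frac = single_insert_fraction(moi)
        print(f"MOI {moi}: {frac:.1%} of transduced cells carry a single insert")
```

Under this model, lowering the MOI raises the share of transduced cells that carry a single miRNA insert, at the cost of transducing fewer cells overall.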

Organotypic Slice Culture System 

Cells to be injected were labeled with cell-tracker red or green (Invitrogen) and 
inoculated into the livers of NOD-SCID or NOD-SCID gamma mice through in- 
trasplenic injection. The livers were then extracted and cut into 150 μm slices 
using a McIlwain tissue chopper (Ted Pella) and plated onto organotypic tissue 
culture inserts (Millipore) and cultured in William’s E Medium supplemented 
with Hepatocyte Maintenance Supplement Pack (Invitrogen). After indicated 
time periods, the liver slices were fixed in paraformaldehyde and imaged. 
Extended protocol can be found in the Extended Experimental Procedures. 

Adeno-Associated Viral Therapy 

miR-483-5p and miR-551a were cloned as a polycistron consisting of both 
miRNA precursors with flanking genomic sequences in tandem into the BglII 
and NotI sites of scAAV.GFP (Plasmid 21893, Addgene). Adeno-associated vi- 
rus was packaged, purified and titered by Vector Biolabs. One day after mice 
were inoculated with colorectal cancer cells, 1 x 1 0^^ AAV viral particles were 
injected into each mouse through intravenous injection. 

Cyclocreatine Treatment of Mice 

One day after inoculation of colon cancer cells, mice were injected with 10 mg 
of cyclocreatine in 300 μl PBS. Treatment was continued daily for 2 weeks until 
the mice were euthanized. 

shRNA and Primer Sequences 

shRNA, primers, and cloning sequences are listed in Tables S4, S5, and S6. 

Additional experimental procedures can be found in the Extended Experi- 
mental Procedures. 

SUMMARY 

Effective silencing by RNA-interference (RNAi) de- 
pends on mechanisms that amplify and propagate 
the silencing signal. In some organisms, small- 
interfering RNAs (siRNAs) are amplified from tar- 
get mRNAs by RNA-dependent RNA polymerase 
(RdRP). Both RdRP recruitment and mRNA silencing 
require Argonaute proteins, which are generally 
thought to degrade RNAi targets by directly cleaving 
them. However, in C. elegans, the enzymatic activity 
of the primary Argonaute, RDE-1, is not required for 
silencing activity. We show that RDE-1 can instead 
recruit an endoribonuclease, RDE-8, to target RNA. 
RDE-8 can cleave RNA in vitro and is needed for 
the production of 3' uridylated fragments of target 
mRNA in vivo. We also find that RDE-8 promotes 
RdRP activity, thereby ensuring amplification of 
siRNAs. Together, our findings suggest a model in 
which RDE-8 cleaves target mRNAs to mediate 
silencing, while generating 3' uridylated mRNA frag- 
ments to serve as templates for the RdRP-directed 
amplification of the silencing signal. 

INTRODUCTION 

RNA interference (RNAi) is an ancient gene-silencing mechanism 
that employs evolutionarily conserved ribonuclease proteins 
called Argonautes. Argonautes achieve sequence-specific tar- 
geting through association with small RNA guides of ~20-30 nu- 
cleotides (for review, see Ghildiyal and Zamore, 2009). Pathways 
related to RNAi are as diverse as the organisms in which they are 
found and regulate a remarkable array of biological phenomena 
(Castel and Martienssen, 2013; Conine et al., 2013; Seth et al., 
2013). 

In C. elegans, RNAi triggered by foreign double-stranded RNA 
(dsRNA) (referred to herein as exo-RNAi) is a two-step Argonaute 
response (Yigit et al., 2006). The primary Argonaute RDE-1 is 
loaded with small interfering RNAs (siRNAs) processed from 
dsRNA by the ribonuclease-III-related enzyme Dicer (DCR-1). 
Target recognition by RDE-1/siRNA complexes initiates the 
amplification of antisense secondary siRNAs, which are synthe- 
sized de novo by RNA-dependent RNA polymerase (RdRP) 
and are primarily 22 nt with a 5'-triphosphorylated guanosine 
(22G-RNAs; Gu et al., 2009; Pak and Fire, 2007; Sijen et al., 
2001, 2007). Secondary siRNAs are loaded onto a family of 
worm-specific Argonautes (WAGOs), which lack catalytic-site 
metal-coordinating residues and thus mediate silencing through 
an unknown mechanism (Yigit et al., 2006). WAGOs include 
cytoplasmic and nuclear members that also function in multi- 
ple endogenous small RNA pathways to silence transposons, 
cryptic or aberrant genes, and foreign sequences (Gu et al., 
2009; Guang et al., 2008, 2010; Shirayama et al., 2012; Yigit 
et al., 2006). 

Endogenous small RNA pathways in C. elegans can be classi- 
fied by their dependence on a primary Argonaute. For example, 
the Piwi ortholog PRG-1 uses genomically encoded piRNAs 
(21U-RNAs; Batista et al., 2008; Das et al., 2008; Ruby et al., 
2006) to recognize targets with incomplete base-pairing comple- 
mentarity and initiate a stable and heritable mode of epigenetic 
silencing known as RNAe (Ashe et al., 2012; Bagijn et al., 2012; 
Buckley et al., 2012; Lee et al., 2012; Shirayama et al., 2012). 
The maintenance of RNAe does not require PRG-1 activity but 
rather depends on RdRPs and both nuclear and cytoplasmic 
WAGOs, as well as chromatin factors (Ashe et al., 2012; Buckley 
et al., 2012; Lee et al., 2012; Shirayama et al., 2012). How the 
small RNA amplification machinery recognizes RNAe targets to 
maintain 22G-RNA levels at each generation remains unknown. 

The ERI (for enhanced RNAi; Kennedy et al., 2004) pathway is 
a two-step Argonaute pathway that directly competes with the 
exo-RNAi pathway for available WAGOs (Duchaine et al., 2006; 
Gent et al., 2010; Vasale et al., 2010; Yigit et al., 2006). The ERI 
pathway requires both an RdRP (RRF-3) and DCR-1 to generate 
26-nt siRNAs with a 5'-monophosphorylated G (Duchaine et al., 
2006; Pavelec et al., 2009; Ruby et al., 2006; Vasale et al., 2010). 
The 26G-RNAs are loaded onto the Argonaute ERGO-1. 
Targeting by ERGO-1/26G-RNAs initiates 22G-RNA biogenesis 
by RdRPs (RRF-1 and EGO-1) and silencing by nuclear and cyto- 
plasmic WAGOs (Gent et al., 2010; Guang et al., 2008; Vasale 
et al., 2010). 

Here, we describe a previously uncharacterized RNAi-defi- 
cient mutant, rde-8. RDE-8 protein contains a ribonuclease 
domain known as an N4BP1, YacP Nuclease (NYN) domain 
(Anantharaman and Aravind, 2006) and is related to the 
Zc3h12a ribonuclease (Matsushita et al., 2009). We show that 
RDE-8 is required for the accumulation of two classes of 
RdRP-dependent small RNAs: RRF-1-dependent 22G-RNAs 
and RRF-3-dependent 26G-RNAs. We further show that RDE- 
8 is required for efficient RRF-1 RdRP activity in vitro. Using 
RNA immunoprecipitation (RIP), we show that RDE-8 associates 
with target mRNAs during exo-RNAi in an RDE-1-dependent but 
RdRP-independent manner. We identify RDE-8 homologs and 
RNAi and transposon-silencing factors as RDE-8-interacting 
proteins, and we show that RDE-8 localizes to germline Mutator 
foci. Using 3' rapid amplification of cDNA ends (RACE), we show 
that RDE-8 promotes the accumulation of target mRNA frag- 
ments tailed with untemplated 3' uridine residues. Our findings 
are consistent with a role for RDE-8 both in mediating mRNA 
cleavage and promoting amplification of the silencing signal. 

RESULTS 

rde-8 Encodes a NYN Domain Ribonuclease 

In a genetic screen for worms with an RNAi-deficient (Rde) 
phenotype, we isolated three independent alleles (ne3309, 
ne3360, and ne3361) of a gene we have named rde-8. In addition 
to the RNAi-deficient phenotype (Figure 1A), we observed a 
slight developmental delay, increased sensitivity to Orsay virus 
infection, and germline-transgene desilencing in rde-8(ne3361) 
(Figure S1). Using single-nucleotide polymorphisms and 3-factor 
analyses, we mapped the rde-8 gene to a small interval on chro- 
mosome IV. Sequencing of candidate genes within this interval 
revealed that all three alleles harbor the same single-nucleotide 
(nt) substitution in exon IV of the gene ZC477.5, resulting in a 
nonsense mutation at tryptophan 189 (Figure 1B). Western blot 
analyses failed to detect the RDE-8 protein in rde-8(ne3361) 
lysates (Figure 1C), suggesting that ne3361 is a null or strong 
loss-of-function allele. Two deletion alleles of rde-8 (tm2252 
and tm2192) that remove all or part of exons 4, 5, and 6 (Fig- 
ure 1B) exhibited the Rde phenotype and failed to complement 
rde-8(ne3361) (data not shown). Finally, an integrated single- 
copy gfp::ZC477.5 transgene rescued the Rde phenotype 
of rde-8(ne3361) (Figures 1A and S1). These data identify 
ZC477.5 as rde-8. 

RDE-8 is predicted to encode a 339 amino acid protein homol- 
ogous to prokaryotic, archaeal, and eukaryotic NYN domain ri- 
bonucleases (Figure 1B; Anantharaman and Aravind, 2006). 
Notably, gfp::rde-8 transgenes bearing mutations in conserved 
aspartic acid residues (either D76N alone or D145A and D146A 
together) that map to the catalytic site of Zc3h12a failed to 
rescue the Rde phenotype of rde-8(ne3361) (Figure 1A and 
data not shown). Western blot analysis of RDE-8 revealed that 
the expression of GFP::RDE-8(D76N) protein is comparable to 
endogenous RDE-8 and wild-type (WT) GFP::RDE-8 (Figure 1C). 



These findings suggest that an intact catalytic domain is required 
for RDE-8 activity. 

To directly test whether RDE-8 encodes a ribonuclease, we 
purified recombinant, histidine-tagged RDE-8(WT) and RDE- 
8(D76N) proteins by nickel-chelating resin, anion-exchange, 
and gel-filtration chromatography (Figure 1D). We incubated 
recombinant RDE-8(WT) or RDE-8(D76N) proteins with an inter- 
nally labeled 116-nt single-stranded RNA using conditions that 
support in vitro Zc3h12a nuclease activity (Matsushita et al., 
2009). RDE-8(WT) degraded the RNA substrate into variable 
size fragments, with prominent products of ~20 nt and 30 nt 
(Figure 1E). These products did not accumulate in reactions 
with recombinant RDE-8(D76N). Instead, an ~85 nt product 
accumulated in the RDE-8(D76N) reactions and to much lower 
levels in RDE-8(WT) reactions. This product could represent 
an intermediate or, alternatively, the product of a bacterial 
nuclease contaminating the RDE-8 preparations. These data 
indicate that RDE-8 encodes an endoribonuclease required 
for RNAi. 

RDE-8 Is Required for the Accumulation of 
RdRP-Dependent Small RNAs 

To explore where RDE-8 functions in the RNAi pathway, we 
examined small RNA production in mutant and WT rde-8 trans- 
genic strains exposed to dsRNA targeting the nonessential 
gene sel-1 (Figure 2). Northern blot analysis revealed that sel-1 
siRNAs were reduced in rde-8(ne3361) relative to WT (Figure 2A) 
and were rescued in gfp::rde-8(+), but not in gfp::rde-8(D76N) 
transgenic animals (Figures 2A and 2B). The microRNAs let-7 
and miR-66 were unaffected and serve as loading controls (Fig- 
ures 2A and 2B). We also cloned and deep sequenced small 
RNAs from rde-8(ne3361) mutants expressing gfp::rde-8(+) or 
gfp::rde-8(D76N) and exposed to sel-1 (RNAi). Consistent with 
the northern blot data, we detected secondary siRNAs 5' of the 
trigger in gfp::rde-8(+) worms after 8 hr of exposure to sel-1 
dsRNA, but not in the gfp::rde-8(D76N) mutant sample (Fig- 
ure 2C). By 24 hr, sel-1 siRNAs throughout the transcript were 
more abundant in the WT sample than in the rde-8 mutant sam- 
ple (Figure 2C). Thus, the RNAi defect of rde-8 mutants corre- 
lates with failure to accumulate RdRP-derived siRNAs. 

We also monitored the effect of rde-8 on the accumulation of 
endogenous 22G-RNAs (Figures 2D and S2). We found that 22G- 
RNAs were reduced at least 2-fold in the rde-8(ne3361) mutant 
for 42% (n = 4,632) of genes with at least 10 antisense reads 
per million total reads in WT. The levels of microRNAs and 
21U-RNAs were unaffected in rde-8(ne3361) mutants (Figures 
2A, 2B, and S2). The 22G-RNA defect of rde-8(ne3361) was 
strongly rescued (87% of target genes; n = 1,938) by the 
gfp::rde-8(+) transgene but only partially (33.7% of target genes; 
n = 1,938) by the active-site mutant gfp::rde-8(D76N) transgene 
(Figure 2D). 
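
The thresholding described in this paragraph (genes with at least 10 antisense reads per million in WT, counted as affected when reduced at least 2-fold in the mutant) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names, the gene names, and the read counts are all hypothetical, and details such as read mapping or the exclusion of structural RNAs are omitted.

```python
from typing import Dict, Tuple


def reads_per_million(counts: Dict[str, int], total_reads: int) -> Dict[str, float]:
    """Normalize raw antisense read counts per gene to reads per million total reads."""
    return {gene: n * 1_000_000 / total_reads for gene, n in counts.items()}


def fraction_reduced(
    wt_rpm: Dict[str, float],
    mut_rpm: Dict[str, float],
    min_wt_rpm: float = 10.0,
    fold: float = 2.0,
) -> Tuple[int, float]:
    """Return (number of eligible genes, fraction reduced at least `fold` in the mutant).

    A gene is eligible if its WT level is at least `min_wt_rpm`; genes absent
    from the mutant sample are treated as having zero reads.
    """
    eligible = [g for g, v in wt_rpm.items() if v >= min_wt_rpm]
    reduced = [g for g in eligible if mut_rpm.get(g, 0.0) * fold <= wt_rpm[g]]
    fraction = len(reduced) / len(eligible) if eligible else 0.0
    return len(eligible), fraction


# Hypothetical toy counts; not data from the study.
wt_counts = {"geneA": 400, "geneB": 100, "geneC": 50, "geneD": 2}
mut_counts = {"geneA": 100, "geneB": 90, "geneC": 10, "geneD": 1}

wt = reads_per_million(wt_counts, total_reads=2_000_000)
mut = reads_per_million(mut_counts, total_reads=2_000_000)
n_eligible, frac = fraction_reduced(wt, mut)
print(f"{frac:.0%} of {n_eligible} eligible genes reduced at least 2-fold")
```

Normalizing both libraries to reads per million before comparing keeps differences in sequencing depth from being mistaken for changes in small RNA abundance.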

Examining the levels of 22G-RNAs antisense to genes tar- 
geted by ERGO-1, WAGO, or CSR-1 (Claycomb et al., 2009; 
Gu et al., 2009; Vasale et al., 2010), we found that 22G-RNAs 
antisense to WAGO and ERGO-1 targets were reduced in 
rde-8 mutants, whereas 22G-RNAs antisense to CSR-1 targets 
were mostly unaffected (Figure 2E and Table S1). The ERGO- 
1- and WAGO-dependent 22G-RNA defects were rescued by 

Figure 1. rde-8 Encodes a Conserved Ribonuclease Required for RNAi 

(A) Graphical representation of RNAi sensitivity in WT, rde-8(ne3361), and transgenic strains (as indicated). Percent lethal indicates the mean percentage of pos-1 
dead eggs (green bars) or the percentage of let-2 ruptured or sterile adults observed (red bars). n, number of animals exposed to RNAi. 

(B) Schematic of the rde-8 locus showing exons (boxes) and introns (lines) with the ribonuclease domain shaded light brown. Deletion (red lines) and nonsense 
(asterisk) alleles are indicated. The alignment shows C. elegans (ce), Drosophila (dm), mouse (mm), and human (hs) homologs with conserved residues (shaded 
brown) and catalytic residues (black background). The asterisk indicates the tryptophan codon (W) mutated in three nonsense alleles. 

(C) Immunoblot analysis of RDE-8, GFP::RDE-8, and GFP::RDE-8(D76N) protein expression. Tubulin was probed as a loading control. Asterisks (*) indicate 
prominent non-specific bands detected by RDE-8 antibody. 

(D) Coomassie blue staining of purified WT and D76N recombinant RDE-8 proteins. 

(E) Denaturing PAGE analysis of recombinant RDE-8 nuclease activity. RDE-8 protein at different concentrations (indicated) was incubated with a 116-nt sel-1 
RNA (nt 414-529) internally labeled with 32P-UTP. 

See also Figure S1. 



the gfp::rde-8(+) transgene, but not by the active-site mutant 
gfp::rde-8(D76N) transgene (Figure 2E). WAGO 22G-RNAs 
dependent on RDE-8 activity included 22G-RNAs that also 
depend on the PRG-1/piRNA pathway (Table S2; Gu et al., 
2009; Lee et al., 2012). RDE-8-dependent 22G-RNAs also 
included RDE-1/mir-243-dependent 22G-RNAs that silence 
Y47H10A.5 in the soma (Table S1; Correa et al., 2010; Gu et al., 
2009). Thus, the small RNA defects of rde-8 mutants are consis- 
tent with the RNAi and transgene-silencing defects of rde-8 
mutants and suggest that RDE-8 activity is required for the pro- 
duction or accumulation of RdRP-dependent siRNAs that func- 
tion in the WAGO and ERI silencing pathways. 



To ask whether RDE-8 is required for the accumulation of
ERGO-1 26G-RNAs, we cloned and deep sequenced 5'-monophosphorylated
small RNAs. We found that 26G-RNAs were
reduced at least 2-fold in the rde-8(ne3361) mutant at 98%
(124/126) of ERGO-1 target mRNAs with a minimum of 10 antisense
26G-RNA reads per million total non-structural reads in
WT and at least 10-fold at 96% (121/126) of the affected loci
(Figure 2F). Interestingly, in the gfp::rde-8(D76N) background,
26G-RNAs were reduced by at least 2-fold at only 35% (44/126)
of target genes and at least 10-fold at only three of these
targets (Figure 2F). Our data suggest that RDE-8 is required for
the accumulation of two different classes of RdRP-generated
Figure 2. RDE-8 Promotes RdRP-Dependent Small RNA Accumulation

(A and B) Northern blot analyses of antisense sel-1 
siRNAs in WT, rde-8 mutant, and mutant trans- 
genic strains (as indicated). The probe hybridizes 
just upstream (5') of the sel-1 dsRNA trigger region
(shown in C). let-7 and miR-66 miRNAs were 
probed as loading controls. 

(C) Histograms showing sense (blue) and anti- 
sense (red) small RNA reads mapping to the sel-1 
gene. Reads were normalized to total non-struc- 
tural reads. The sel-1 exons (boxes) and introns 
(lines) are indicated at bottom; dashed lines 
delineate the dsRNA trigger region. 

(D) Dot plots of endogenous 22G-RNAs targeting 
annotated genes in rde-8(ne3361) mutant (top)
and mutant transgenic strains (as indicated) 
gfp::rde-8(D76N) (middle) or gfp::rde-8(+) (bottom) 
compared to WT. “rpm” indicates the number of 
reads per million total reads for a given gene. The 
black diagonal indicates x = y. Dashed lines (gray) 
demark regions where loci show the indicated fold 
decrease of 22G-RNA reads compared to WT. 
Gray dots indicate loci that change less than 
2-fold. 

(E) Box and whisker plots comparing ERGO-1, 
WAGO, and CSR-1 pathway 22G-RNAs in rde- 
8(ne3361) mutant (pink) and mutant gfp::rde-8 
transgenic lines (+, orange; D76N, green). The 
ratio of mutant/(mutant + WT) is shown. The 75th
through 25th percentile are boxed, with the median
value shown as a horizontal line within the box. 
Dashed lines indicate 2-fold enrichment (top) and 
depletion (bottom). 

(F) Dot plots of endogenous 26G-RNA levels in 
rde-8(ne3361) mutant (left) and mutant
gfp::rde-8(D76N) transgenic animals (right) compared to
WT. Red dots represent ERGO-1 targets (Vasale
et al., 2010), and gray dots represent loci with
non-ERGO-1-associated 26 nt antisense reads.

See also Figure S2. 
small RNAs, WAGO 22G-RNAs, and ERGO-1 26G-RNAs, but the
ribonuclease activity of RDE-8 is not required for 26G-RNA
accumulation.
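The thresholding used for the 26G-RNA comparison can be illustrated with a minimal sketch. It assumes per-gene antisense read counts have already been normalized to reads per million non-structural reads; the function names, the dictionary layout, and the example values are hypothetical and are not taken from the scripts used in this study.

```python
def reads_per_million(count, total_nonstructural_reads):
    """Normalize a raw read count to reads per million non-structural reads."""
    return count * 1_000_000 / total_nonstructural_reads


def count_reduced(wt_rpm, mutant_rpm, fold, min_wt_rpm=10.0):
    """Count targets reduced at least `fold`-fold in the mutant.

    wt_rpm and mutant_rpm map a gene name to its normalized antisense
    26G-RNA level. Only genes with at least `min_wt_rpm` in WT are scored.
    Returns (number reduced, number of eligible genes).
    """
    eligible = [g for g, level in wt_rpm.items() if level >= min_wt_rpm]
    reduced = [g for g in eligible
               if mutant_rpm.get(g, 0.0) * fold <= wt_rpm[g]]
    return len(reduced), len(eligible)


# Hypothetical values, for illustration only.
wt = {"geneA": 100.0, "geneB": 50.0, "geneC": 5.0}
mutant = {"geneA": 5.0, "geneB": 30.0, "geneC": 0.0}
print(count_reduced(wt, mutant, fold=2))
print(count_reduced(wt, mutant, fold=10))
```

Applied to the counts reported above, the same arithmetic gives 124/126 (about 98%) and 121/126 (about 96%) for the rde-8(ne3361) mutant and 44/126 (about 35%) for the D76N background.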

RDE-8 Interacts with ERI/DICER and RNAi/Mutator 
Pathway Components 

To understand how RDE-8 promotes RNAi and 22G-RNA 
biogenesis, we sought to identify proteins that interact with 



RDE-8. Using size-exclusion chromatography
to examine the molecular weight
of RDE-8 complexes in worm lysates, 
we found that endogenous RDE-8, which 
has a molecular weight of 38.8 kDa, 
migrated between 158 kDa (Aldolase) 
and 440 kDa (Ferritin) in gel filtration 
analysis (Figure S3). GFP::RDE-8 did not 
coimmunoprecipitate (co-IP) with endo- 
genous RDE-8 (data not shown), sug- 
gesting that the higher-molecular-weight 
complexes are not composed of RDE-8 multimers. Using multi- 
dimensional protein identification technology (MudPIT; Chen 
et al., 2006), we identified several RDE-8-interacting proteins 
whose loss-of-function phenotypes are similar to those of 
rde-8 (Figure 3A and Table S3), including the β-nucleotidyltransferase
RDE-3, MUT-15/RDE-5, and the Q/N domain protein
MUT-16/RDE-6 (Chen et al., 2005; Vastenhouw et al., 2003; Gu 
et al., 2009; Zhang et al., 2011). 
Figure 3. RDE-8 Interacts with Mutator Components and RDE-8 Homologs

(A) Proteins identified by MudPIT analysis of GFP
immunoprecipitates from transgenic gfp::rde-8(+)
worms but not from WT worms that do not express
GFP::RDE-8. The percent coverage and total
number of peptides are indicated for each RDE-8
interactor.

(B) Graphical representation of RNAi sensitivity of
WT or mutant strains (as indicated). Percent lethal
indicates the mean percentage of pos-1 dead eggs
(green bars) or the percentage of let-2 ruptured or
sterile adults observed (red bars). n, number of
animals exposed to RNAi.

(C) Dot plots (as described in Figure 2D) of
endogenous 22G-RNAs targeting annotated genes
in nyn-1(tm5004);nyn-2(tm4844) double mutants
compared to WT.

(D) Box and whisker plots (as described in
Figure 2E) comparing ERGO-1, WAGO, and
CSR-1 pathway 22G-RNAs in rde-8(ne3361)
(orange), nyn-1(tm5004);nyn-2(tm4844) double
mutants (blue), and rde-8(ne3361);nyn-1(tm5004);
nyn-2(tm4844) triple mutants (brown) relative
to WT.

See also Figure S3.



Interestingly, we found that three RDE-8 interactors are ho- 
mologous to RDE-8, including ERI-9 and two previously unstud- 
ied proteins, T23G4.3 and Y87G2A.7, which we have named 
NYN-1 and NYN-2, respectively. ERI-9 was previously shown 
to be required for 26G-RNA biogenesis (Pavelec et al., 2009). 
Consistent with the association of RDE-8 with ERI/Dicer com- 
plex components (Duchaine et al., 2006), we found that RDE-8 
also co-IPs with the SAP-domain exonuclease ERI-1b (Figure S3;
Kennedy et al., 2004), which interacts with both ERI-9 and Dicer 
and is required for the RdRP-dependent biogenesis of 26G- 
RNAs (Duchaine et al., 2006; Pavelec et al., 2009; Thivierge 
et al., 2012). 

NYN-1 and NYN-2 are paralogs and more similar to ERI-9 
than to other C. elegans NYN domain proteins, yet were more 
highly enriched than ERI-9 in RDE-8 IPs. To test whether 
NYN-1 and NYN-2 are required for exo-RNAi, we obtained deletion
alleles of nyn-1(tm5004 and tm5149) and nyn-2(tm4844).
Single mutants were fully sensitive to RNAi in the germline
(pos-1) and soma (let-2), but nyn-1;nyn-2 double mutants were
strongly RNAi deficient in both tissue types (Figure 3B). Thus, 
NYN-1 and NYN-2 appear to act redundantly in the exo-RNAi 
pathway. 

Consistent with the RNAi defect of nyn-1;nyn-2 mutants, we
found that WAGO-dependent 22G-RNAs and ERI-pathway
small RNAs (both 26G-RNAs and 22G-RNAs) were markedly
reduced (Figures 3C, 3D, and S3). A triple nyn-1;nyn-2;rde-8
mutant did not significantly enhance the 22G-RNA defect (Fig- 
ure S3). Together, our findings suggest that NYN-1 and NYN-2 
function with RDE-8 and transposon-silencing factors to pro- 
mote the biogenesis of RdRP-dependent siRNAs in WAGO- 
and ERI-dependent silencing pathways. 



RDE-8 Localizes to P-Granule-Associated Mutator Foci 

The identification of MUT-16/RDE-6, MUT-15/RDE-5, and RDE- 
3/MUT-2 as RDE-8 interactors suggested that RDE-8 might 
localize to recently described Mutator foci: perinuclear germline 
foci that are distinct from, but often adjacent to, germline P gran- 
ules (Phillips et al., 2012). Indeed, endogenous RDE-8 protein 
was most abundant in the hermaphrodite or female germline 
(Figure 4A), and GFP::RDE-8 was primarily observed in the 
germline cytoplasm and in prominent perinuclear foci associated 
with nuclear pores in the germline (Figure 4B). Moreover, perinu- 
clear GFP::RDE-8 foci were both fewer in number than and adja- 
cent to P granules identified by RFP::PGL-1 (Figure 4C; Wolke 
et al., 2007). These data are consistent with the idea that RDE- 
8 functions along with its interactors MUT-16, MUT-15, and 
RDE-3 in Mutator foci. 

RDE-8 Is Important for Efficient RdRP Activity 

RDE-8 is required for RdRP-dependent siRNA accumulation and 
interacts with several components of Mutator foci, which are 
thought to be compartments in which RdRP activity promotes 
siRNA accumulation (Phillips et al., 2012). However, RDE-8
and RdRP interactions were not detected reproducibly in our
co-IP studies (data not shown). To ask whether RDE-8 promotes
RdRP activity in vitro, we used an assay in which the de novo
synthesis of 22G-RNAs is dependent on the RdRP RRF-1, the
β-nucleotidyltransferase RDE-3, and a template RNA that is
not polyadenylated (Figure 5; Aoki et al., 2007). Consistent with 
the reduced level of 22G-RNAs observed in rde-8 mutants, we 
found that the activity of RdRP was reduced by ~50% in the 
rde-8(ne3361) lysate relative to the WT lysate (Figure 5 and 
Experimental Procedures). The levels of RRF-1 protein and a 
Figure 4. GFP::RDE-8 Localizes to Perinuclear Foci in the Germline

(A) Immunoblot analysis of RDE-8 protein from WT grown at 20° and 25°C 
(hermaphrodites), fem-1(hc17) grown at 25°C (females), fog-2(q71) (males), 
and from glp-4(bn2) animals that lack a germline at 25°C (no germline). 

(B and C) Confocal images of dissected gonads. In (B), gonads expressing 
GFP::RDE-8 (green) were stained with the MAb414 to detect nuclear pore 
complex (NPC) proteins (red) and Hoechst to detect DNA (blue). Image overlay 
at right. In (C), gonads express both GFP::RDE-8 (green) and the constitutive P 
granule component, RFP::PGL-1 (red). Image overlay at right. 



control protein (PRG-1) were similar in the rde-8(ne3361) and WT
lysates (Figure 5A). RdRP was also less active in a gfp::rde-8(D76N)
lysate relative to a gfp::rde-8(+) lysate (Figure 5B). These
findings suggest that RDE-8 is important for efficient RdRP 
activity. 

RDE-8 Interacts with Target mRNA and Requires RDE-1 
and Trigger dsRNA 

To ask whether RDE-8 interacts with the target mRNA during 
RNAi, we immunoprecipitated GFP::RDE-8 from worms exposed
to dsRNA targeting sel-1 or a negative control, and then
used RT-qPCR to detect regions of the sel-1 mRNA (Figure 6A). 
We failed to detect a significant enrichment of sel-1 mRNA in 
GFP::RDE-8(WT) IP experiments (Figure 6B). This lack of enrichment
could result from GFP::RDE-8(WT) binding only transiently
to the sel-1 mRNA and then perhaps rapidly cleaving and 
releasing it. We therefore also tested for RNA binding by the 
catalytically inactive GFP::RDE-8(D76N). Strikingly, we found 
that RDE-8(D76N) specifically captured the sel-1 transcript 
when animals were exposed to sel-1 dsRNA (Figure 6A). Inter- 
estingly, GFP::RDE-8(D76N) IP enriched similar levels of sel-1 
mRNA from both upstream and downstream of the dsRNA 
trigger region (regions 1 and 4, Figure 6A), suggesting that 
GFP::RDE-8(D76N) associates with an intact sel-1 transcript, 
one that has not already been cleaved by the primary Argonaute 
RDE-1 (see below and Discussion). 

We next examined the genetic requirement for target mRNA 
recognition by GFP::RDE-8(D76N). The Argonaute RDE-1 is 
required for the initiation of RNAi and is loaded with primary 
siRNAs processed from dsRNA by Dicer (Yigit et al., 2006). 
Dicer-dependent primary siRNAs are present in rde-8 mutants 
(Figure 2C), and, based on affinity capture experiments using 
2'-O-methylated RNA oligos, they are loaded onto functional
RDE-1 complexes (Figure S4). Consistent with the idea that
these primary RDE-1/siRNA complexes are required for RDE-8
to interact with the target, we found that GFP::RDE-8(D76N)
failed to capture target mRNA in the rde-1(ne300) mutant
background (Figure 6B). The ability of RDE-1 to promote RDE-8
binding to the target is likely to be independent of RDE-1 catalytic
activity because the catalytic mutant RDE-1(AAA) protein
promotes secondary siRNA biogenesis and silencing triggered
by dsRNA (Pak et al., 2012; Steiner et al., 2009). As expected,
we found that the ability of RDE-1(AAA) to promote sel-1(RNAi)
is dependent on RDE-8 catalytic activity: sel-1(RNAi) dramatically
reduced sel-1 mRNA levels in rde-1(AAA) animals, but
not in rde-1(AAA); gfp::rde-8(D76N) animals (Figure S4). Thus,
RDE-8 functions downstream of target recognition by RDE-1.

We next asked whether RdRP is required for RDE-8 to bind 
target mRNA. To remove RdRP activity, it was necessary to 
use the double-mutant rrf-1(pk1417) glp-4(bn2) background, 
in which the somatic RdRP rrf-1 is deleted and the germline 
RdRP ego-1 (Smardon et al., 2000) is not expressed due to the 
absence of germline at 25°C in the temperature-sensitive, germ- 
line-deficient mutant glp-4(bn2) (Beanan and Strome, 1992). 
Conditional alleles of ego-1, which is an essential gene, do not 
exist. We found that depletion of RdRP failed to block the asso- 
ciation of GFP::RDE-8(D76N) with sel-1 target mRNA (Figure 6B). 
Thus, RDE-8 recognizes the target transcript upstream of RdRP 
and secondary small RNA amplification. 

Finally, we asked whether RDE-8 interactors are required for 
RDE-8 to bind the target. In mut-15(tm1358), mut-16(tm3748), 
or nyn-1(tm5004);nyn-2(tm4844) mutants, enrichment of sel-1
mRNA by GFP::RDE-8(D76N) RIP was reduced to 22%-42%
of the WT level (Figure 6B). By contrast, we found that
GFP::RDE-8(D76N) RIP in the rde-3(ne3370) mutant background
enriched sel-1 mRNA to 77% of WT levels (Figure 6B). These
data suggest that RDE-8 interactors facilitate or stabilize RDE-8
binding to the target mRNA and function together upstream
of RdRP. The β-nucleotidyltransferase RDE-3, however, appears
to be less important for RDE-8 binding to the target than for 
RdRP activity, suggesting that RDE-3 may function between 
RDE-8 and RdRP. 
Figure 5. RDE-8 Promotes RdRP Activity In Vitro

(A) Top: in vitro RdRP activity assayed in WT, rde-8(ne3361), rrf-1(pk1417), or rde-3(ne3370) lysates in the presence (+) or absence (-) of in-vitro-transcribed and
capped RNA template. “pA” denotes that a polyA stretch was added to the 3' end of the template. Incubation times are indicated in minutes (min). Oligonucleotide
size markers are indicated. “A” represents uridylated template RNA (Aoki et al., 2007). Bottom: Immunoblot analysis of RRF-1, RDE-8, and PRG-1 (loading
control) protein levels in lysates used for the RdRP assay.

(B) In vitro RdRP activity assayed in sel-1 dsRNA-fed, gfp::rde-8(WT) and gfp::rde-8(D76N) lysates in the presence (+) of in-vitro-transcribed and capped RNA
template, as in (A).




RDE-8 Is Required for the Accumulation of 3' Uridylated 
mRNA Cleavage Products 

Upstream components of the C. elegans RNAi machinery must 
somehow generate mRNA-derived templates for the RdRP- 
dependent amplification of the silencing signal. In Tetrahymena, 
efficient RdRP recruitment requires 3' uridylation of RNA tem- 
plates (Lee et al., 2009; Talsky and Collins, 2010). We therefore 
asked whether target mRNA fragments tailed with untemplated 
residues accumulate during RNAi in C. elegans and, if so, 
whether or not these products are dependent on RDE-8 activity. 
To do this, we used 3' RACE to search for mRNA cleavage prod- 
ucts in animals exposed to dsRNA targeting the sel-1 transcript 
and in control animals not exposed to dsRNA. A 3' linker was 
ligated to the RNA to provide an anchor for first-strand cDNA 
synthesis and subsequent PCR reactions. To amplify potential 
5' sel-1 mRNA cleavage products, and not the ingested sel-1 
dsRNA, we amplified each cDNA library using a series of nested 
sel-1 mRNA-specific primers to generate 3'-RACE products with 
5' ends that lie 40, 30, 20, and 12 nt upstream of the dsRNA 
target sequence. The nested cDNAs were pooled and amplified, 
and 3'-RACE products were gel purified and deep sequenced 
(Figure S5). 
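One way to score untemplated 3' residues in such reads is sketched below. This is an illustrative assumption rather than the pipeline used here: it presumes each linker-trimmed read begins at a known position on the sel-1 reference (for example, a nested primer site), treats the longest matching 5' prefix as templated, and calls the remainder the tail. The function names and sequences are hypothetical, and a tail residue that happens to match the template is conservatively counted as templated.

```python
def split_tail(read, reference, start):
    """Split a read into (templated part, untemplated 3' tail).

    `read` and `reference` are cDNA-alphabet strings (A, C, G, T);
    `start` is the reference index where the read is assumed to begin.
    """
    i = 0
    while (i < len(read)
           and start + i < len(reference)
           and read[i] == reference[start + i]):
        i += 1
    return read[:i], read[i:]


def classify_tail(tail):
    """Label a tail as 'none', a single RNA base ('U', 'A', 'C', 'G'), or 'mixed'."""
    if not tail:
        return "none"
    if len(set(tail)) == 1:
        # A T in the cDNA read corresponds to a U in the RNA.
        return "U" if tail[0] == "T" else tail[0]
    return "mixed"


# Hypothetical reference and reads, for illustration only.
reference = "ACGTACGGAC"
for read in ("GTACTT", "GTACGG", "GTACAT"):
    body, tail = split_tail(read, reference, start=2)
    print(read, body, tail, classify_tail(tail))
```

A production analysis would also need to tolerate sequencing errors within the templated portion, which this sketch does not attempt.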



Consistent with the idea that RDE-8 activity is important for 
target cleavage, we found that target-mRNA fragments tended 
to be higher in gfp::rde-8(+) worms than in gfp::rde-8(D76N) 
worms after 3 and 7 hr of RNAi (Figure S6). Notably, we found 
that target-mRNA fragments containing untemplated uridines 
(but not untemplated A, C, or G residues) increased in gfp::rde-8(+)
rescued rde-8(ne3361) animals exposed to RNAi by 5.5-fold
after 3 hr and by 20-fold after 7 hr (Figure 7A). These time points
correspond to the interval where 22G-RNA production is first
detectable during RNAi (Figures 2C and 7B). By contrast, the levels
of uridylated sel-1 fragments remained low in rde-8(ne3361);
gfp::rde-8(D76N) transgenic worms throughout the sel-1(RNAi)
time course and in gfp::rde-8(+) rescued animals that were not
exposed to sel-1 dsRNA (Figures 7A-7C). In WT worms expressing
endogenous RDE-8, we found that fragments modified with
2 untemplated uridine residues were the most prominent species
after 3 hr of sel-1(RNAi), but fragments with 1, 2, 3, or 4 untemplated
uridines were observed at similar levels after 7 hr (Figure 7B).
In the gfp::rde-8(+) strain, however, mono-uridylated fragments 
were by far the most prominent species (Figure 7B). 

To examine the possible relationship between 3' uridylation 
and RdRP function, we cloned and deep sequenced 22G-RNAs 
Figure 6. RDE-8 Binds Target mRNA Downstream of RDE-1

(A and B) Bar graphs depicting RT-qPCR results of sel-1 mRNA levels in GFP
immunoprecipitates divided by levels in control IgG precipitates. All strains 
assayed were rde-8(ne3361) and transgenic for gfp::rde-8(D76N), except for 
one strain that was transgenic for gfp::rde-8(+) (orange bar in B). In (A), a 
schematic diagram of the sel-1 locus shows the region targeted by dsRNA and 
the four regions assayed by RT-qPCR. In (A), absence of sel-1 dsRNA (-) 
served as a specificity control for each region assayed. In (B), all strains were 
exposed to sel-1 dsRNA and were assayed for region II. 

See also Figure S4. 

from each time point in the gfp::rde-8(+) and gfp::rde-8(D76N)
transgenic strains. When we mapped the 3' uridylation and
22G-RNA initiation sites, we observed at least one 22G-RNA
peak that initiates 5' and proximal to the major uridylation sites
in gfp::rde-8(WT), but not gfp::rde-8(D76N), transgenic worms
(Figure 7B). In addition, the 3' uridylation products were observed
at 3 hr, whereas the corresponding 22G-RNAs were not detected
until 7 hr, suggesting that the accumulation of uridylated sel-1
fragments precedes the accumulation of the corresponding
22G-RNAs at these sites.

Finally, we examined whether the accumulation of uridylated
target mRNA fragments depends on RdRP and RDE-3 activity.
Depleting RdRP activity, using an rrf-1(pk1417) glp-4(bn2) double-mutant
background as described above, we found that
sel-1 mRNA fragments were uridylated proximal to the dsRNA
trigger region in rde-8(ne3361); gfp::rde-8(+) animals, but uridylation
was reduced in rde-8(ne3361); gfp::rde-8(D76N) animals
and in rde-3(ne3370) animals (Figure 7D). In these experiments,
the pattern of uridylation differed from that observed in the time
course above (perhaps owing to the lack of germline or RdRP). 
These data suggest that, together, RDE-8 and RDE-3 act up- 
stream of RdRP to promote the uridylation of 5' sel-1 fragments 
that could function as RdRP templates. 

DISCUSSION 

The RdRP-dependent amplification of secondary siRNAs is 
essential for robust silencing during RNAi in C. elegans. Thus, 
target mRNA destruction must be managed during RNAi so as 
to preserve mRNA sequences that serve as templates for 
RdRP amplification. In this study, we have shown that RDE-8 
encodes a ribonuclease that interacts with target mRNA down- 
stream of the Argonaute RDE-1 but upstream of RDE-3 and 
RdRP. RDE-8 forms a complex with RDE-3 in vivo and is 
required for the uridylation of 5' fragments of the target mRNA 
and for the amplification of secondary siRNAs by RdRP. 

It is well known that a number of viral RdRPs prefer to initiate 
de novo transcription using GTP (Kao et al., 2001). Furthermore, 
the Neurospora RdRP QDE-1, a homolog of worm RdRPs, prefers
to initiate de novo transcription with GTP and to produce 9
to 21 nt small RNAs that are distributed across the template 
(Makeyev and Bamford, 2002). In C. elegans, RdRPs prefer to 
initiate transcription at a YG motif (as viewed from the antisense 
strand), where G is the first nucleotide of the 22G-RNA preceded 
by a pyrimidine (Y), which is similar to the YR motif preferred by 
RNA polymerase II for transcription initiation, where a purine (R) 
is the first nucleotide of the transcript (Gu et al., 2012, and refer- 
ences therein). 
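The YG preference can be restated from the template's point of view: an antisense 5'-YG-3' corresponds to a sense-strand C followed on its 3' side by a purine. The sketch below enumerates such positions on an mRNA and reports the antisense RNA that would start at each one. It is a hedged illustration only; the function names are hypothetical, a fixed length of 22 nt is assumed, and no claim is made that every such site is used in vivo.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def candidate_22g(mrna, length=22):
    """List (template C index, antisense RNA) for each candidate YG site.

    A site is a C on the mRNA followed 3' by a purine (A or G). The
    antisense RNA starts with the G that pairs with this C and extends
    toward the 5' end of the mRNA.
    """
    sites = []
    for i in range(length - 1, len(mrna) - 1):
        if mrna[i] == "C" and mrna[i + 1] in "AG":
            sites.append((i, reverse_complement(mrna[i - length + 1:i + 1])))
    return sites


# Hypothetical template, for illustration only.
for index, small_rna in candidate_22g("A" * 21 + "CG"):
    print(index, small_rna)
```

Each reported RNA begins with G by construction, and the antisense residue immediately 5' of that G is a pyrimidine because it pairs with the purine that follows the template C.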

Our findings suggest a model (Figure 7E) whereby mRNAs are 
initially recognized but not cleaved by RDE-1. Instead, RDE-1 recruits
a complex containing RDE-8, which cleaves the target
mRNA, exposing a 3' end that can be uridylated by the β-nucleotidyltransferase
homolog RDE-3. Thus, RDE-8 cleavage and
RDE-3-dependent 3' uridylation of cleavage products may provide
a signal or platform to recruit the RdRP complex. RdRP
could then, in turn, initiate de novo transcription from internal C
nucleotides near the 3' end of the uridylated template. This
model is consistent with previous work showing that 22G-RNA 
amplification is highest proximal to the dsRNA trigger region (Si- 
jen et al., 2001, 2007; Pak and Fire, 2007). RDE-8 may function 
similarly in the ERI pathway, after ERGO-1 targeting, and may 
also function along with RDE-3 downstream of WAGO-mediated 
target recognition. 

The nuclease activity of recombinant RDE-8 requires con- 
served aspartic acid residues that map to the catalytic site of 
Zc3h12a (Matsushita et al., 2009; Xu et al., 2012). Zc3h12a de- 
stabilizes the mRNAs of immune-related factors, including IL-6 
and IL-12p40, by directly binding and cleaving 3' UTRs (Matsushita
et al., 2009). Zc3h12a was also shown to negatively regulate
miRNA expression by cleaving the terminal loop of pre-miRNAs
(Suzuki et al., 2011). The CCCH domain of Zc3h12a is
required for RNA-binding activity in vitro and for cleavage in vivo,
but not in vitro (Suzuki et al., 2011). RDE-8 contains no recognizable
RNA-binding domain, and we only detected target binding
when the conserved catalytic residues of RDE-8 were mutated,
suggesting a transient or indirect interaction between RDE-8
and the target mRNA. Perhaps consistent with the latter possibility,
the interaction with target mRNA was partially dependent on
factors that interact with RDE-8 and required the Argonaute
RDE-1, which may directly or indirectly recruit the RDE-8 complex
to the target mRNA.

Among the factors that interact with RDE-8, we identified three
homologs of RDE-8, including ERI-9 and the closely related
redundant genes NYN-1 and NYN-2. Previous work has shown
that ERI-9 is a component of the ERI-pathway siRNAs expressed
in embryos (Duchaine et al., 2006; Pavelec et al., 2009). Our small
RNA data indicate that RDE-8, NYN-1, and NYN-2 function along
with ERI-9 in the ERI pathway. Remarkably, ERI-9, NYN-1, and
NYN-2 lack predicted active-site residues and are thus unlikely
to encode functional nucleases. Nevertheless, these factors
were required for ERI-pathway 26G-RNA and 22G-RNA biogenesis,
and NYN-1 and NYN-2 were also required for RDE-8(D76N)
to interact with the target mRNA during RNAi and for RDE-8 to
localize to Mutator foci. Together, these results suggest a structural
rather than catalytic role for ERI-9, NYN-1, and NYN-2 in the
RDE-8 complex. Interestingly, although the catalytic activity of
RDE-8 was required for 22G-RNA biogenesis, it was not required
for 26G-RNA accumulation. Perhaps RDE-8 and ERI-9 are structurally
important for a functional ERI complex, promoting RRF-3-dependent
26G-RNA biogenesis. A structural role for ribonucleases
is well documented; the eukaryotic PM/Scl complex, or
exosome, for example, is composed of multiple RNase PH family
members that lack catalytic capacity (Jain, 2012). Detailed structure-function
studies are necessary to sort out the role of these
and six other RDE-8 homologs in C. elegans. Our findings, along
with previous work on Zc3h12a, suggest that members of this
conserved nuclease family share ancient and fundamental roles
in immunity.

How Does RDE-8 Function during RNAi? 

Several studies suggest that direct Argonaute-mediated target
mRNA cleavage is not required for mRNA silencing during
RNAi (and related pathways) in C. elegans. For example, when
engineered to contain mutations in conserved metal-coordinating
residues needed by other Argonautes for RNA cleavage,
RDE-1 and PRG-1 could nevertheless initiate RdRP recruitment
and gene silencing in the dsRNA- and piRNA-initiated pathways,
respectively (Bagijn et al., 2012; Lee et al., 2012; Pak and Fire,
2007; Shirayama et al., 2012; Steiner et al., 2009). Moreover,
all 12 of the downstream WAGO Argonautes, required for
silencing in both of these pathways, encode proteins that lack
key residues required for target cleavage (Yigit et al., 2006). Similarly,
Rhodobacter sphaeroides Argonaute is not a functional
endonuclease, but promotes the silencing of foreign genetic material
(Olovnikov et al., 2013). These observations suggest that, in
the absence of endonuclease activity, Argonautes can promote
silencing by guiding other nucleases or turnover pathways to
their targets (Huntzinger and Izaurralde, 2011; Lim et al., 2014).
Our finding that target mRNA cleavage products with untemplated
uridine residues accumulate during RNAi in an RDE-8-dependent
manner raises the possibility that RDE-8 (or other
components of the RDE-8 complex) may provide cleavage activity
important for RNAi. Our findings place RDE-8 downstream of
primary Argonautes (RDE-1, ERGO-1, and PRG-1) and upstream
or at the same level as RdRP in each of these pathways.

Several recent studies have identified factors that appear to
function between RDE-1 and RdRP, including RDE-10, RDE-11,
and RDE-12 (Shirayama et al., 2014; Yang et al., 2012,
2014; Zhang et al., 2011). Based on IP experiments presented
here and in the aforementioned studies, RDE-8 does not interact
with these factors. Furthermore, RDE-10, RDE-11, and RDE-12
are specifically required for WAGO-associated 22G-RNAs
dependent on RDE-1 and ERGO-1 (Shirayama et al., 2014;
Yang et al., 2012, 2014; Zhang et al., 2011), whereas RDE-8 is
more broadly required for WAGO-associated 22G-RNAs. Thus,
RDE-8 may function at a distinct step (or multiple steps) in the
RNAi pathway.

Finally, RDE-8 also promotes the accumulation of WAGO-associated
22G-RNAs that are independent of known primary
Argonautes and thus appear to function in self-enforcing transgenerational
silencing pathways. It is tempting to speculate
that WAGOs have evolved catalytic mutations so that they do
not cleave the target mRNA within the region required to template
de novo synthesis of the siRNAs that successfully guided
them to the target. Instead, WAGOs may recruit secondary nucleases
that cleave the target mRNA 3' of where their guide
RNAs engage the target. The β-nucleotidyltransferase homolog
RDE-3 could then modify this cleavage product to stabilize it
or to recruit RdRP to regenerate the 22G-RNA, thereby propagating
self-enforcing silencing signals (Figure 7E). In contrast,
perhaps RDE-1 recruits RDE-8 more broadly along the target
mRNA, producing multiple cleavage products that can serve to
generate new RdRP-derived siRNAs (Figure 7E).

Conclusion 

RDE-8 homologs regulate post-transcriptional silencing in innate
immune pathways in both worms and mammals. Our findings
suggest that RDE-8 homologs might function as nucleases or
as structural subunits of silencing complexes that promote 3' uridylation
of substrates. Uridylation plays diverse and important
roles in small RNA pathways (Lee et al., 2014; Talsky and Collins,
2010). In plants and animals, 3' uridylation promotes the turnover
of miRNAs and siRNAs (Ameres et al., 2010; Ibrahim et al., 2010),
and miRNA- or siRNA-directed cleavage of mRNA results in 3'
uridylation of the 5' cleavage product and rapid mRNA decay
(Shen and Goodman, 2004). Moreover, Kim and colleagues
have recently shown that 3' uridylation enhances mRNA decay
of deadenylated miRNA targets (Lim et al., 2014). Perhaps 3' uridylation
of transcripts processed by RDE-8-related nucleases is
a conserved signal for post-transcriptional silencing. It will therefore
be interesting to learn whether RDE-8-related proteins function
broadly in uridylation-dependent pathways that regulate
gene expression.

EXPERIMENTAL PROCEDURES 
Genetics 

C. elegans culture and genetics were performed essentially as 
described (Brenner, 1974). Unless otherwise noted, the WT strain in this
study is the Bristol N2 strain. Alleles used in this study listed by 
Figure 7. RDE-8 Promotes Target mRNA Cleavage and 3' Uridylation Adjacent to Sites of Secondary siRNA Initiation

(A) Bar graphs showing the percentage of untemplated residues detected by 3' RACE at 3' ends of sel-1 mRNA fragments during sel-1(RNAi). WT or rde-8(ne3361)
animals were exposed to control dsRNA (-) or sel-1 dsRNA for 1, 3, or 7 hr.

chromosome: LGI: mut-16(tm3748, ne322), rde-3(ne3370), rrf-1(pk1417), glp-4(bn2),
nyn-2(tm4844); LGII: neSi24[gfp::rde-8(+), cb-unc-119(+)], neSi25
[gfp::rde-8(D76N), cb-unc-119(+)]; LGIV: rde-8(ne3361, tm2192, tm2252),
fem-1(hc17), nyn-1(tm5004, tm5149); LGV: rde-1(ne300), mut-15(tm1358),
fog-2(q71). The genetic screen and transgenic procedures are detailed in
Extended Experimental Procedures. 

Recombinant RDE-8 Protein Purification and Ribonuclease Assay 

rde-8 cDNAs (WT and D76N) were cloned into a pET expression vector. 
Expression was induced in BL21(DE3) cells at 16°C overnight. 6His-RDE-8
fusion proteins were extracted in 50 mM Tris (pH 7.5), 50 mM NaCl by
sonication. Soluble 6His-RDE-8 fusion proteins were purified by anion exchange
(Q-Sepharose), nickel-chelate, and gel-filtration chromatography.
Protein purity was verified by SDS-PAGE. Nuclease assays were performed
as described in Matsushita et al. (2009), but the buffer was adjusted
to pH 6.0. 

RNA Immunoprecipitation 

Worms were harvested as adults and extracted in 1 pellet volume of homog- 
enization buffer (25 mM HEPES-KOH (pH 7.5), 10 mM potassium acetate, 
2 mM magnesium chloride, 0.1% NP-40, 110 mM potassium chloride, 
200 U/ml SUPERase•In [Ambion]). 20 mg of lysate was incubated with 20 µg
of anti-GFP antibody (Wako) or IgG control for 1 hr at 4°C. Immune complexes 
were captured with Protein A/G-Sepharose beads (Santa Cruz Biotechnology) 
and washed with homogenization buffer. RNA was extracted with Trizol (MRC, 
Inc.), and cDNA was prepared using Superscript III (Life Technologies) and a 
mixture of sel-1 and act-3 primers. The level of sel-1 mRNA was measured 
by quantitative PCR relative to act-3. Primers are listed in Extended Experi- 
mental Procedures. 
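The enrichment calculation implied by this procedure can be sketched as follows. The sketch assumes threshold-cycle (Ct) values and roughly 100% amplification efficiency, neither of which is specified above; the function names and Ct values are hypothetical. It expresses sel-1 relative to act-3 in each precipitate and then divides the GFP IP value by the IgG control value.

```python
def relative_level(ct_target, ct_reference):
    """Target level relative to a reference transcript.

    Assumes each PCR cycle doubles the product (about 100% efficiency).
    """
    return 2.0 ** -(ct_target - ct_reference)


def ip_enrichment(gfp_ip, igg_ip):
    """Relative sel-1 level in the GFP IP divided by that in the IgG control.

    Each argument is a (Ct of sel-1, Ct of act-3) pair.
    """
    return relative_level(*gfp_ip) / relative_level(*igg_ip)


# Hypothetical Ct values, for illustration only.
print(ip_enrichment(gfp_ip=(24.0, 22.0), igg_ip=(27.0, 22.0)))
```

With measured primer efficiencies, the base of 2 would be replaced by the efficiency-corrected amplification factor for each primer pair.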

MudPIT 

MudPIT analysis was performed essentially as described (Conine et al., 2013). 
See Extended Experimental Procedures for details. 

Small RNA Cloning and Data Analysis 

Small RNAs between 18 and 40 nt were gel purified and cloned after treatment
with CIP (NEB) and PNK (NEB) or without pretreatment (direct cloning) as
described (Gu et al., 2009; Vasale et al., 2010). Libraries were sequenced either
on an Illumina GAII or HiSeq instrument in the UMass Medical School Deep
Sequencing Core. Sequences were aligned to the worm genome (WS235) 
using Bowtie 0.12.9 (Langmead et al., 2009). Custom Python scripts used to 
process and analyze the data are available upon request. See Extended 
Experimental Procedures for details. 

3' RACE Sequencing 

Total RNA was extracted from worms fed with bacteria expressing sel-1 or 
control dsRNA. RNAs were ligated to the activated 3' linker, miR Linker 1 
(IDT). Ligation products were reverse transcribed using a primer specific to 
the 3' linker sequence. Libraries were generated by nested PCR and 
sequenced on an Illumina HiSeq 2000 instrument. See Extended Experimental
Procedures for details. 

RdRP Assay 

The in vitro RdRP assay (Aoki et al., 2007) was performed four times. Lysates
were prepared from synchronized adult worms. Labeled products separated
on a 15% polyacrylamide/8 M urea gel were detected and quantified using a
Storm phosphorimager and ImageQuant software (GE Healthcare). The rate
of 32P-UTP incorporation into siRNAs by RdRP was calculated as the slope
of time course data (e.g., Figure 5) fitted to a linear function.
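
The slope calculation described above can be sketched as an ordinary least-squares fit. This is only an illustration, not the authors' script: the function name, the time points, and the signal values below are hypothetical, and the original analysis may have used different software or units.

```python
def least_squares_slope(times, signal):
    """Return the slope of the ordinary least-squares line through (times, signal)."""
    n = len(times)
    if n < 2 or n != len(signal):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    # Sum of squared deviations of the time points from their mean.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    # Sum of cross-products of time and signal deviations.
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return sxy / sxx


# Hypothetical time course: minutes versus phosphorimager signal (arbitrary units).
example_minutes = [0, 15, 30, 60]
example_signal = [2.0, 32.0, 62.0, 122.0]  # constructed to rise by 2 units per minute
example_rate = least_squares_slope(example_minutes, example_signal)
```

With real data, the slope would be reported in signal units per unit time, and replicate assays would each yield their own slope.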

Fluorescence Imaging 

Gonads of adult rde-8(ne3361);gfp::rde-8(+) animals were extruded by decap- 
itating worms on a slide. Extruded gonads were fixed with formaldehyde, per- 
meabilized by freeze-cracking, and stained as described (Phillips et al., 2009). 
Confocal images were acquired using a Zeiss LSM510 Meta Confocal micro- 
scope and Zen 2008 (Zeiss) software. 

ACCESSION NUMBERS 

Illumina data are available from GEO under the accession number GSE59300.

figures, and three tables and can be found with this article online at http://

SUMMARY

The barrier to curing HIV-1 is thought to reside primarily
in CD4+ T cells containing silent proviruses.
To characterize these latently infected cells, we stud- 
ied the integration profile of HIV-1 in viremic progres- 
sors, individuals receiving antiretroviral therapy, and 
viremic controllers. Clonally expanded T cells repre- 
sented the majority of all integrations and increased 
during therapy. However, none of the 75 expanded 
T cell clones assayed contained intact virus. In 
contrast, the cells bearing single integration events 
decreased in frequency over time on therapy, and 
the surviving cells were enriched for HIV-1 integra- 
tion in silent regions of the genome. Finally, there 
was a strong preference for integration into, or in 
close proximity to, Alu repeats, which were also en- 
riched in local hotspots for integration. The data indi- 
cate that dividing clonally expanded T cells contain 
defective proviruses and that the replication-competent
reservoir is primarily found in CD4+ T cells that
remain relatively quiescent. 

INTRODUCTION 

Despite effective therapy, HIV-1 can persist in a latent state as an
integrated provirus in resting memory CD4+ T cells (Chun et al.,
1997; Finzi et al., 1997; Wong et al., 1997). The latent reservoir
is established very early during infection (Chun et al., 1998),
and because of its long half-life of 44 months (Finzi et al.,
1999), it is the major barrier to curing HIV-1 infection (Siliciano
and Greene, 2011).

The HIV-1 latent reservoir has been difficult to define, in part
because reactivation of latent viruses is difficult to induce and
to measure. Viral outgrowth assays underestimate the size of
the reservoir, while direct measurements of integrated HIV-1
DNA overestimate the reservoir because a large fraction of the
integrated viruses are defective (Ho et al., 2013). Although the
latent reservoir remains to be completely defined, establishing
the reservoir requires intact retroviral integration into the genome
and subsequent transcriptional silencing (Siliciano and Greene,
2011). Whether or not the genomic location of the integration
impacts latency is debated (Jordan et al., 2003; Jordan et al.,
2001; Sherrill-Mix et al., 2013). However, HIV integration into
the genome is known to favor the introns of expressed genes
(Han et al., 2004), some of which, like BACH2 and MKL2, carry
multiple independent HIV-1 integrations in different individuals
and are considered hotspots for integration (Ikeda et al., 2007;
Maldarelli et al., 2014; Wagner et al., 2014). However, there is
currently no precise understanding of the nature of these
hotspots or why they are targeted by HIV-1.

Viremia rebounds from the latent reservoir after interruption of
long-term treatment with combination anti-retroviral therapy
(cART). When it does, it appears to involve an increasing
proportion of monotypic HIV-1 sequences, suggesting the
proliferation of latently infected cells (Wagner et al., 2013). Based on
this observation and the finding that a subset of cells bearing
integrated HIV-1 undergoes clonal expansion in patients
receiving suppressive anti-retroviral therapy, it has been
proposed that the clonally expanded cells play a critical role in
maintaining the reservoir (Maldarelli et al., 2014; Wagner
et al., 2014).

To obtain additional insights into the regions of the genome 
that are favored by HIV-1 for integration and the role of clonal 
expansion in maintaining the reservoir, we developed a single 
cell method to identify a large number of HIV-1 integration sites 
from treated and untreated individuals, including “viremic con- 
trollers” who spontaneously maintain viral loads of <2,000 RNA 
copies/ml and “typical progressors” who display viral loads 
>2,000 RNA copies/ml. 
Figure 1. HIV-1 Integration Libraries

(A) Diagram of integration library construction. 

(B) Table of unique integrations identified in viremic controllers (C), viremic untreated progressors (V), and treated progressors (T). 

(C) Proportion of integrations (Int) that are in genic or intergenic regions. 

(D) Proportion of genic integrations located in introns. 

(E) Proportion of integrations in genes with high, medium, or low expression. p values refer to proportion of integrations in highly expressed genes.

(F) Transcriptional orientation of integrated HIV-1 relative to host gene. ns, not significant.

*p < 0.05; **p < 0.01; ***p < 0.0001 using two-proportion z test. See also Figure S1.



RESULTS 

Integration Library Construction 

Twenty-four integration libraries were constructed from CD4+
T cells from 13 individuals: three provided longitudinal samples
before and after (0.1-7.2 years) initiation of therapy; four were
untreated; two were treated; and four were viremic controllers (Table
S1). Patients were grouped into three categories based on viral
loads and therapy: (1) viremic progressors were untreated
individuals with viral loads higher than 2,000 viral RNA copies/ml of
plasma; (2) progressors were treated individuals whose initial viral
loads were higher than 2,000 viral RNA copies/ml before therapy;
(3) controllers were individuals who maintain low viral loads
spontaneously in the absence of therapy (<2,000 viral RNA copies/ml).
The frequency of latently infected, resting CD4+ T cells in our
patients was similar to that reported by others as measured by
quantitative viral outgrowth assay (Table S1 and Laird et al., 2013).
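
The three-way grouping above can be written as a small decision rule. The sketch below is illustrative only: the function and argument names are hypothetical, and because the text does not say how a load of exactly 2,000 copies/ml or a treated individual with a low pre-therapy load would be handled, those cases are returned as "unclassified" here.

```python
VIRAL_LOAD_THRESHOLD = 2000  # viral RNA copies/ml of plasma, as stated in the text


def classify_patient(on_therapy, viral_load, pre_therapy_viral_load=None):
    """Assign one of the three patient categories described in the text.

    on_therapy: True if the individual is receiving antiretroviral therapy.
    viral_load: current viral load (copies/ml); used for untreated individuals.
    pre_therapy_viral_load: load before therapy began; used for treated individuals.
    """
    if on_therapy:
        # Treated progressors are defined by their load before therapy.
        if (pre_therapy_viral_load is not None
                and pre_therapy_viral_load > VIRAL_LOAD_THRESHOLD):
            return "treated progressor"
        return "unclassified"  # assumption: case not covered by the text
    if viral_load > VIRAL_LOAD_THRESHOLD:
        return "viremic progressor"
    if viral_load < VIRAL_LOAD_THRESHOLD:
        return "viremic controller"
    return "unclassified"  # assumption: exactly 2,000 copies/ml is not specified


example_groups = [
    classify_patient(False, 45000),
    classify_patient(False, 800),
    classify_patient(True, 40, pre_therapy_viral_load=120000),
]
```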

Libraries were produced from genomic DNA by a modification
of the translocation-capture sequencing method that we refer to
in this paper as integration sequencing (Figure 1A) (Janovitz
et al., 2013; Klein et al., 2011). Virus integration sites were
recovered by semi-nested ligation-mediated PCR from fragmented
DNA using primers specific to the HIV-1 3' LTR (Table S2).
PCR products were subjected to high-throughput paired-end
sequencing, and reads were aligned to the human genome.
Since sonication is random, it produces unique linker ligation
points that identify the specific integration events in each
infected CD4+ T cell, which allows both single-cell resolution and
identification of expanded clones of cells with identical
integrations (Berry et al., 2012 and Figure 1A). Thus, integration
sequencing can enumerate both the number of integration sites
and the number of infected cells.

A total of 6,719 unique virus integration sites were determined
(Table S3): 873 unique integrations in viremic controllers; 987
integrations in untreated progressors; and 4,859 integrations in
treated progressors (Figure 1B).

Integrations Are Enriched in Introns of Highly Expressed 
Genes 

We analyzed the genomic location of the integration sites
obtained from viremic controllers and untreated and treated
progressors and compared our results to published data obtained
from HIV-1-infected individuals (Han et al., 2004; Ho et al.,
2013; Ikeda et al., 2007; Schröder et al., 2002). In agreement
with the work of others, the majority of integration sites in each
group are genic (Figure 1C and Figure S1A). Moreover,



Cell 160, 420-432, January 29, 2015 ©2015 Elsevier Inc. 421










integrations are found more frequently in the introns of highly
expressed genes, and there is a slight bias for viral orientation that
leads to convergent transcription (Figures 1D, 1E, and 1F and
Figures S1B-S1E) (Mitchell et al., 2004; Schröder et al., 2002).
Thus, the general features of integrations defined by integration
sequencing are similar to those obtained by others.

Although the differences between groups were small in
magnitude, they were significant in that treated progressors
had a smaller proportion of integrations in genic regions (p <
0.0001 and p < 0.0001, respectively) and in highly expressed
genes (p < 0.0001 and p < 0.0001, respectively) when compared
to viremic controllers and untreated progressors (Figures 1C
and 1E). Conversely, the proportion of viral integrations in
genes expressed at lower levels was increased in treated
progressors compared to viremic controllers and untreated
progressors (p = 0.002 and p < 0.0001, respectively). Viremic
controllers and treated progressors were not significantly
different from each other in terms of the level of expression of the
genes at the sites of integration (Figure 1E). Thus, therapy is
associated with a relative decrease in the number of cells
with viral integrations in highly expressed genes.

Figure 2. Identification of Clonally Expanded Cells Bearing Integrated HIV-1

(A) Proportion of viral integrations (Int) that are clonally expanded, as identified by the same integration site with multiple shears in controllers (C) or viremic (V) or treated progressors (T).

(B) Proportion of infected cells deriving from clonal expansion.

(C) Proportion of clonally expanded (CE) and single (S) viral integrations in genic or intergenic regions.

(D) Proportion of clonally expanded and single viral integrations in introns.

(E) Proportion of clonally expanded or single viral integrations in genes with high, medium, or low expression. p values refer to proportion of integrations in highly expressed genes.

(F) Flow cytometry sorting strategy to identify CD4+ T cell subsets. CM, TM, and EM cell subsets were identified based on their CD45RA, CCR7, and CD27 expression. Shown is one representative sort.

(G) Proportion of viral integrations (Int) that are clonally expanded, as identified by the same integration site with multiple shears in sorted CD4+ T cell subsets from patient 9.

(H) Proportion of infected cells deriving from clonal expansion in sorted CD4+ T cell subsets from patient 9.

ns, not significant. ***p < 0.0001 using two-proportion z test. See also Figure S2.



Identification of Clonally Expanded 
Cells Containing Integrated HIV-1 

Since we shear DNA ends randomly to 
produce our libraries and by paired-end 
sequencing can determine the precise site of both the integration 
and sheared end, we infer that identical integrations with unique 
sheared ends arise from clones of expanded cells (Figure 1A). In- 
tegrations can therefore be classified as clonally expanded (i.e., 
identical integrations with distinct sheared ends, deriving from 
the clonal expansion of an original unique, single integration 
event) or single integrations (i.e., unique integration site with a 
single sheared end). 
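
The classification rule in the preceding paragraph can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' pipeline: each read pair is reduced to a hypothetical (integration site, shear position) tuple, reads sharing both coordinates are treated as duplicates of one cell, and real data would also need tolerance for alignment and sequencing error.

```python
from collections import defaultdict


def classify_integrations(read_pairs):
    """Group read pairs by integration site and count distinct sheared ends.

    read_pairs: iterable of (integration_site, shear_position) tuples.
    Returns {site: {"cells": n_distinct_shears, "class": label}}.
    """
    shear_ends = defaultdict(set)
    for site, shear in read_pairs:
        # A set collapses repeated reads of the same fragment (same site and shear).
        shear_ends[site].add(shear)
    return {
        site: {
            "cells": len(ends),
            "class": "clonally expanded" if len(ends) > 1 else "single",
        }
        for site, ends in shear_ends.items()
    }


# Hypothetical reads: site = (chromosome, position, strand), shear = genomic position.
example_reads = [
    (("chr16", 14200000, "+"), 14200310),
    (("chr16", 14200000, "+"), 14200455),
    (("chr16", 14200000, "+"), 14200512),
    (("chr6", 90000000, "-"), 89999720),
    (("chr6", 90000000, "-"), 89999720),  # duplicate read of the same fragment
    (("chr2", 5000000, "+"), 5000260),
]
example_calls = classify_integrations(example_reads)

# Proportion of unique integrations that are clonally expanded.
n_sites = len(example_calls)
n_clonal_sites = sum(1 for c in example_calls.values() if c["class"] == "clonally expanded")
fraction_clonal_integrations = n_clonal_sites / n_sites

# Proportion of infected cells that belong to an expanded clone.
n_cells = sum(c["cells"] for c in example_calls.values())
n_clonal_cells = sum(c["cells"] for c in example_calls.values() if c["class"] == "clonally expanded")
fraction_clonal_cells = n_clonal_cells / n_cells
```

Counting distinct sheared ends per site is what lets the same data report both the number of integration sites and the number of infected cells, so the two proportions can differ substantially.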

Clonally expanded viral integrations were present in all
individuals irrespective of therapy or viremia (Table S3). However, the
proportion of clonally expanded viral integrations is significantly
lower in viremic controllers (30%) and viremic progressors (27%)
than in treated progressors (40%) (p < 0.0001 and p < 0.0001,
Figure 2A and Figure S2A). Although the size of individual clones
varied from 2 to 295 cells (Figure S2B), the relative increase in
clonally expanded integrations during therapy consistently
translated into an increase in the number of infected cells that
derive from expanded clones (Figure 2B). The percentage of
cells containing clonally expanded HIV-1 integrations was similar
in untreated progressors (78%) and controllers (79%), but it was
significantly increased in treated progressors (90%) (p < 0.0001
and p < 0.0001, Figure 2B and Figure S2C). Thus, therapy is
associated with an increase in the frequency of clonal HIV-1
integrations and infected clonally expanded cells.

To determine whether the position of viral integration in the
genome correlates with clonal expansion, we compared the
genomic location of clonally expanded integrations with that of
single integrations. Both types of integrations favored genes and
their introns (Figures 2C and 2D and Figures S2D and S2E). However,
the proportion of clonally expanded integrations in intergenic regions was
greater than that of single integrations (Figure 2C, p < 0.0001).
Moreover, of the integrations in genes, single integrations were



Figure 3. Clonally Expanded Viral Integra- 
tions Increase and Single Integrations 
Decrease during Therapy 

Graphs show data from patients 1 (blue), 2 (red), 
and 3 (green) from longitudinal time points (Table
S1). Time was normalized from 0 to 1 (727 days
pre-therapy to 2617 days post-therapy). Dotted 
line at t = 0.21 marks therapy initiation. Trendline 
was determined by linear regression model. Solid 
lines indicate significant change in proportion of 
events; dashed lines indicate insignificant change 
in proportion of events. 

(A) Proportion of clonally expanded viral in- 
tegrations (Int). 

(B) Proportion of single viral integrations. 

(C) Proportion of genic clonally expanded viral 
integrations. 

(D) Proportion of genic single viral integrations. 

(E) Proportion of intergenic clonally expanded viral 
integrations. 

(F) Proportion of intergenic single viral integrations. 
See also Figure S3. 



more likely to be found in highly ex- 
pressed genes than clonal integrations 
(Figure 2E, p < 0.0001, and Figure S2F). 
Thus, cells harboring viral integrations in 
intergenic regions and genes that are ex- 
pressed at lower levels are more likely to 
be clonally expanded. 



Large Expanded Clones Are Found 
in Memory Cells 

Central memory cells are thought to be
the major source of the HIV-1 reservoir
(Chomont et al., 2009). To investigate
the nature of the cells that comprise the
expanded clones, we performed virus
integration sequencing on genomic DNA
from sorted central, transitional, and
effector memory CD4+ T cells (Figure 2F).
In both individuals studied, all three
subsets of CD4+ T cells contained expanded
clones (Figures 2G and 2H and Figures S2G and S2H). Thus,
central, transitional, and effector memory T cells, all of which
have undergone antigen-stimulated cell division, harbor the
expanded clones of HIV-1 integrants.

Clonally Expanded Integrations Increase after Therapy 

The proportion of clonally expanded viral integrations is
increased in treated progressors (Figure 2A and Wagner et al.,
2014). To further examine the effect of therapy on clonal
expansion, we analyzed longitudinal samples from three typical
progressors before and during therapy (Table S1). We found an
increase in the number of clonally expanded integrations
throughout the treatment period of up to 7.2 years in two of the
three patients (Figure 3A, p = 0.017, and Figure S3A) as well as
an increase in the number of cells that contained clonally
expanded viral integrations (Figure S3B). Correspondingly, there
was also an overall decrease over time in single integrations (p =
0.017), with a half-life of 127 months assuming a non-linear
regression model for one-phase decay (Figure 3B). Thus, our
data suggest that the numbers of single integrations decay
very slowly over time, while clonally expanded integrations
increase with time on cART.
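
To make the half-life figure concrete, a one-phase decay can be written as N(t) = N0 * exp(-k * t), with half-life ln(2)/k. The sketch below assumes this standard form, which the text does not spell out, and uses hypothetical function names; it shows the fraction of single integrations expected to remain after a given time.

```python
import math


def decay_constant(half_life):
    """Rate constant k of a one-phase exponential decay with the given half-life."""
    return math.log(2) / half_life


def fraction_remaining(elapsed, half_life):
    """Fraction N(t)/N0 remaining after `elapsed` time (same units as half_life)."""
    return math.exp(-decay_constant(half_life) * elapsed)


HALF_LIFE_MONTHS = 127  # half-life of single integrations reported in the text

# Sanity check: exactly one half-life leaves half of the starting population.
after_one_half_life = fraction_remaining(HALF_LIFE_MONTHS, HALF_LIFE_MONTHS)

# Fraction remaining at the longest follow-up in the study (7.2 years of therapy).
after_follow_up = fraction_remaining(7.2 * 12, HALF_LIFE_MONTHS)
```

Under these assumptions, well over half of the single integrations would still be present at the end of the longest follow-up, consistent with the description of a very slow decay.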

The increase in the number of clonal integrations during cART
did not favor genic or intergenic regions (p = 0.65), indicating that
this effect is independent of the location of the integration in the
genome (Figures 3C and 3E and Figure S3C). In contrast, single
integrations decrease significantly in genic regions (Figure 3D,
p = 0.036, and Figure S3D) and increase proportionally in
intergenic regions (Figure 3F, p = 0.036). Thus, the fate of cells
harboring single viral integrations in cART-treated progressors
differs from that of cells harboring clonal integrations. Moreover,
the fate of single integrations depends on their location in the
genome, whereas the fate of clonal integrations does not. These
results suggest that cells bearing genic single integrations are
selected against during therapy, whereas clonally expanded cells are not.

Clonally Expanded Integrations in the Same Genes in 
Multiple Patients 

In the three progressors who provided longitudinal samples, 
~5% of the clonal integrations persisted through successive 
time points without selection for genic or intergenic regions 
compared to all clonal integrations (Figures 4A and 4B). Further- 
more, of the genic integrations that persisted, there was also no 
selection for or against those in highly expressed genes (Fig- 
ure 4C). Thus, the persistent clonal integrations are indistinguish- 
able from the larger pool of clonally expanded viral integrations in 
terms of their position in the genome. 

To determine whether specific genes or groups of genes are 
permissive for clonal expansion, we looked for overlap in genic 
integration sites between samples (Figures 4D-4F). Despite a 
higher number of single integrations, there was much greater 
overlap of the genes that harbor clonally expanded integrations 
between individuals irrespective of treatment or level of viremic 
control (p < 0.0001) (Figures 4D-4F). On average, there is 13% 
and 3% overlap between genes harboring clonally expanded 
and single viral integrations, respectively. The genes containing 
clonally expanded viral integrations in multiple patients are ex- 
pressed at lower levels than genes containing overlapping single 
viral integrations (Figure 4G). Taken together, these results sug- 
gest that cells that carry integrations in highly expressed genes 
tolerate clonal expansion less well than cells with integrations 
in genes with lower levels of expression. 

Since clonal integrations have been associated with genes 
involved in malignant transformation (Wagner et al., 2014), we 
examined our entire dataset for enrichment of integrations in 
cancer-associated genes (n = 743 cancer-associated genes 
[Vogelstein et al., 2013; Zhao et al., 2013]). Although there was 
an overall enrichment for integrations in cancer genes (329/ 
4,410 = 7.5%) compared to all genes in the human genome 
(743/25,660 = 2.8%) (p < 0.0001), this preference does not 
seem to be significant because it is similar to the overall prefer- 
ence for integration into highly expressed genes (Figure S4). 
Furthermore, we observed no overrepresentation of single, 
clonal, or persistent integrations in cancer genes (Figure 4H). 



Importantly, a significant decrease in integrations in cancer-related
genes was observed in longitudinal samples (Figure 4I),
suggesting that these are selected against with therapy.
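
The cancer-gene enrichment quoted above (329/4,410 versus 743/25,660) can be checked with a two-proportion z test, the test named in the figure legends. The sketch below uses the textbook pooled-variance form with a two-sided p value; the function name is hypothetical, and whether the authors applied exactly this variant to these counts is an assumption.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns (z, two-sided p value)."""
    p1 = x1 / n1
    p2 = x2 / n2
    # Pooled proportion under the null hypothesis that both groups share one rate.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the text: genes with integrations that are cancer-associated
# versus cancer-associated genes among all genes in the human genome.
z_stat, p_value = two_proportion_z_test(329, 4410, 743, 25660)
```

Note that the two groups are not independent samples, since genes with integrations are a subset of all genes, so the result is best read as a rough check on the reported significance level rather than a reanalysis.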

Expanded Clones Contain Defective Viruses 

Our method of integration sequencing captures the end of the 3' 
LTR and identifies the genomic site of viral integration. To deter- 
mine whether the viruses found in expanded clones are intact, 
we used nested integration site-specific PCR primers that 
were anchored in the host genome to amplify the 5' LTRs of 75 
expanded clones from eight individuals (Figure 5A and Table 
S2). The clones selected for PCR verification varied in size 
from 5 to 200 out of 0.3-2 x 10^6 CD4+ T cells. Of the 75 sequences
obtained, 24 showed fragmented 5' LTRs flanked by the correct 
genomic site, and an additional 44 of the proviruses did not have 
a recoverable 5' end (Figure 5B). The remaining eight viruses with
intact 5' LTRs were amplified in limiting dilution conditions using
integration site-specific primers and HIV-1 primers (Figure 5C).
Three of the eight viruses could not be amplified; four had large
deletions in Env, one had a frameshift mutation in pol, and one
had undergone APOBEC3G-mediated hypermutation to produce
a premature stop codon in env (Figure 5D and Data S1).
Thus, we were unable to find a single intact integrated provirus 
among 75 expanded clones. 

Hotspots for Virus Integration 

Overlap between integrations in the genes of different patients
suggests the existence of hotspots for HIV-1 integration. A number
of individual genes have been identified as preferential sites
for HIV-1 integration, including BACH2, MKL2, DNMT1, MDC1,
and STAT5B (Ikeda et al., 2007; Maldarelli et al., 2014; Wagner
et al., 2014). To identify hotspots for HIV-1 integration genome-wide,
we subjected our dataset to hot_scan analysis (Silva
et al., 2014), which defines hotspots by identifying regions of
local enrichment using scan statistics. This analysis identified
55, 85, and 247 hotspots for controllers, viremic, and treated
progressors, respectively (Figure 6A). For example, the intron
between exons 5 and 6 in MKL2 is a hotspot for integration in
patient 11, contains an expanded clonal family in patient 10, and
was also identified as a site of enrichment for integration by
others (Maldarelli et al., 2014) (Figure 6B).

To validate our in silico analysis and to further characterize the 
MKL2 hotspot, we sequenced the gag gene from proviruses in- 
tegrated into MKL2 by amplification with nested genomic 
primers specific for MKL2 and HIV-1 gag. Sequences obtained 
from patient 10, who showed only one expanded clone, are 
very closely related to each other, which is consistent with a 
single clonally expanded integration (Figure 6C). In contrast,
sequences obtained from patient 11 are far more diverse, suggesting
that there were several different viral integrations in the MKL2
hotspot (Figure 6C). We conclude that the hotspots defined by 
hot_scan represent multiple distinct integration events in close 
proximity. 

Viremic progressors had the highest proportion of integration 
events in hotspots, indicating that, in the case of high-level 
viremia, there are specific genomic locations that favor integration 
(Figure 6D). Although the majority of all integrations fall outside of 
hotspots (Figure 6D), hotspots resemble other integrations in that 
Figure 4. Integrations in Genes Permissive for Clonal Expansion Occur in Multiple Patients

(A) Percent viral integrations present in more than one time point (persistent integrations) in patients 1, 2, and 3 (Table S1).

(B) Comparison of persistent (P) and clonally expanded (CE) viral integrations in genic or intergenic regions.

(C) Proportion of persistent and clonally expanded viral integrations in genes with high, medium, or low expression. p values refer to proportion of integrations in
highly expressed genes.

(D-F) Heatmap showing overlap between samples of genes containing clonally expanded or single viral integrations. Patients are indicated by
P1-13. Multiple samples from one individual are marked by a bracket. The amount of overlap is denoted by color (see legend); red, 100% overlap.

(G) Genes containing single or clonally expanded viral integrations were analyzed for their presence in multiple patients. Genes with integrations in more than one
individual were classified as “overlapping”; genes with integrations in only one individual were classified as “unique.” Shown is the proportion of single and
clonally expanded unique and overlapping viral integrations in genes with high, medium, or low expression. p values refer to proportion of integrations in highly
expressed genes.

(H) Genes with integrations were analyzed for their association with cancer. Proportions of cancer-associated genes are shown for single, clonally expanded, and
persistent viral integrations. The number indicates the total number of genes from each category.

(I) Graph shows proportion of integrations in cancer-related genes from patients 1 (blue), 2 (red), and 3 (green) from longitudinal time points (Table S1). Time was
normalized from 0 to 1 (727 days pre-therapy to 2617 days post-therapy). Dotted line at t = 0.21 marks therapy initiation. Trendline was determined by linear
regression model and indicates significant change in proportion of events, p = 0.023.

ns, not significant. *p < 0.05; **p < 0.01; ***p < 0.0001 using two-proportion z test. See also Figure S4.



they are preferentially found within genes with a preponderance of 
these in introns (Figures 6E and 6F). In all cases, hotspots are en- 
riched in highly expressed genes, and consistent with the overall 
decrease in viral integrations in highly expressed genes during 
therapy, the proportion of hotspots in these genes also decreases 
(Figures 6G and 1E). Thus, the general characteristics of hotspots
are similar to features of all integrations. 

To determine whether there is a relationship between hotspots
and clonally expanded viral integrations, we enumerated single
and clonally expanded integrations in hotspots (Figure 6H).
Only a small fraction (11%-18%) of all single integrations were
found in hotspots, with untreated viremic progressors showing
the highest level (Figure 6H). In contrast, there was a much higher
proportion of clonal integrations in hotspots (30%-46%), with
the lowest proportion in treated progressors (Figure 6H). This
observation is consistent with the finding that there is a greater
degree of overlap in genes that harbor clonally expanded rather
than single integrations (Figures 4D-4F) and that clonally
expanded integrations are more likely to be in hotspots than single
integrations (Figure 6H, p < 0.0001).
Figure 5. Large Expanded Clones Are Defective

(A) Sequence analysis of 5' LTRs in clonally expanded integrations. Of 75 different clonally expanded integrations from 8 individuals, 24 showed fragmented 5'
LTRs, 44 didn’t have a recoverable 5' LTR, and 8 contained intact 5' LTRs.

(B) Strategy for HIV-1 sequencing. Eight proviruses were analyzed for intact viral sequence. Nested genomic primers and internal HIV primers were used in a PCR
walking strategy to amplify fragments a-e from specific clonally expanded integrations. PCR products were sequenced directly.

(C) Summary of HIV-1 sequencing from large expanded clones. Sequences were aligned to HXB2 and examined for the presence of large internal deletions. Intact
sequences were analyzed for G -> A hypermutation by Los Alamos Hypermut algorithm (Rose and Korber, 2000). Non-hypermutated products were analyzed for
intact reading frames and frameshift mutations by Los Alamos HIVQC.

Green dot, intact, non-hypermutated sequence; red dot, no PCR product recovered; red triangle, sequence with internal deletion; -, not done. See also Data S1.



Viral Integration Enriched in Sites Containing a DNA 
Sequence Motif 

To determine whether there are specific genomic features
associated with sites of viral integration and hotspots, we examined
100 base pairs (bp) centered on all integration sites for the
presence of a consensus sequence (Bailey and Elkan, 1994). We
found 7% of all integrations within 100 bp of a highly conserved
30 bp motif (INT-motif) (Figure 7A). The majority of the
integrations identified in this analysis were single integration events,
with the ratio of single-to-clonal integrations being significantly
different from the expected (Figure 7B, p < 0.0001). When
HIV-1 integrates directly into the INT-motif, the 5' end of the motif
is recurrently found 20 bp from the site of viral integration
(Figure 7C). The INT-motif is asymmetrically distributed in Alu
repeats, and its position coincides with a peak of viral integration
(Figure 7D). Furthermore, there is a significant overall enrichment
of integrations inside Alu repeats (Figure 7E) and in close
proximity to Alu repeats, irrespective of whether the integration is
inside genes or in intergenic regions (Figure 7F). Thus, a
preference for Alu is independent of a preference for integration in
genes.

Previous studies have shown a preference for integration into
Alu repeats, potentially because Alu repeats are enriched in
gene-rich regions (Schröder et al., 2002). To further examine
the relationship between Alu repeats and transcription of
genes, we determined the distance between Alu repeats and
the center of all genes. There was no positive correlation
between the position of Alu and the level of transcription
(Figure 7G). To determine whether the distance between
integration and Alu repeats correlates with transcription, we
measured the distance between the sites of integration and
Alu repeats in all genes (Figure 7H). There was no significant
Figure 6. Identification of Hotspots for HIV-1 Integration

(A) Number of hotspots identified by hot_scan in viremic controllers (C) and viremic untreated (V) and treated progressors (T).

(B) Integrations in MKL2 from patients 10 and 11. Gray vertical arrows indicate site of integrations. Colored horizontal lines show fragments of DNA spanning the
point of integration through sheared end. Green, viruses integrated in the same orientation as gene; red, convergent orientation; orange, viruses integrated with
both orientations.

(C) HIV-1 gag was amplified from integrated proviruses in MKL2 from patients 10 and 11. PCR was performed using nested integration site-specific primers and
HIV-1 gag primers. Sequences were clustered to assess DNA sequence similarity. The scale bar represents 0.007 substitutions per site.

(D) Proportion of virus integrations inside hotspots.

(E) Proportion of hotspots in genic and intergenic regions.

(F) Proportion of hotspots in introns.

(G) Proportion of hotspots in genes with high, medium, or low expression. p values refer to proportion of integrations in highly expressed genes.

(H) Percentage of total single and clonally expanded viral integrations inside of hotspots. Enrichment of clonally expanded viral integrations compared to single
integrations is significant, p < 0.0001.

ns, not significant. *p < 0.05; **p < 0.01; ***p < 0.0001 using proportion test.

difference between integration distance to Alu repeats in highly
expressed, silent, or trace level expressed genes. Therefore,
the rate of transcription does not impact integration distance
to Alu repeats, and integration at these sites must be independent
of transcription.

Finally, the number of Alu repeats in a hotspot is directly correlated
with the number of integration events in that hotspot
(Figure 7I, ρ = 0.86). We conclude that HIV-1 has a preference for
integration in close proximity to sites in the genome that are enriched
in Alu repeats and that this preference is independent of
the level of transcription.



DISCUSSION 

T cells that are actively infected with HIV-1 are rapidly eliminated
during anti-retroviral therapy, but this form of treatment is
relatively ineffective in selecting against latently infected CD4+
T cells, which have an estimated half-life of 44 months. Abolishing
the latent reservoir is the current hurdle to finding a cure for
HIV-1 infection. Although we have learned a great deal about
the location of the latent compartment and its persistence during
therapy, it has been difficult to uncover whether there are specific
genomic features associated with latency (Siliciano and
Figure 7. Consensus Motif for Viral Integration

(A) 30 bp sequence consensus motif (INT motif). 100 bp around all viral integration
sites were analyzed for a consensus sequence by MEME (Bailey and Elkan, 
1994). 444 integration sites were identified with the INT motif. E value: 6.4 x 
1 The dotted line shows the preferred site of integration (see also C). 

(B) Number of single (S) and clonally expanded (CE) integrations that were identified to
contain the INT motif within 100 bp of the integration site. p < 0.0001, using
two-proportion z test.



Greene, 2011). One of the major impediments to understanding 
latency is our inability to purify cells harboring latent HIV-1 as 
opposed to cells containing defective viruses. To further investi- 
gate the latent compartment, we used a high-throughput 
method that uncovers sites of HIV-1 integration while enumer- 
ating clones of expanded T cells that bear identical integrations. 

By comparing HIV-1 integration in controllers and untreated 
and treated progressors, including longitudinal samples ob- 
tained before and after therapy, we found that proliferating 
clones of infected cells accumulate over time. However, we 
were unable to detect intact, full-length viral sequences in these 
clones. Instead, our evidence suggests that the reservoir resides 
primarily in cells bearing unique integrations that are selected 
against by cART in an integration-specific manner, favoring the 
persistence of integrations in intergenic regions and silent genes, 
with decay kinetics that argue against homeostatic proliferation. 

A number of different investigators have shown that HIV-1 pre- 
fers to integrate into the introns of highly expressed genes (Crai- 
gie and Bushman, 2012). This is true for all of the individuals in 
our study irrespective of their status as controllers or treatment 
with cART. Although the level of intrinsic viremic control has no 
detectable effect on integration site selection, therapy selects 
against genic integrations and, more specifically, against inte- 
grations in highly expressed genes, when compared to un- 
treated progressors or controllers. Given that cART selects for 
cells that bear silent proviruses, the results suggest that viruses 
integrated into genes are less likely to become latent than those 
found in intergenic regions. Moreover, the data indicate that, 
among the proviruses integrated into genes, those that are found 
in genes expressed at low levels are also more likely to become 
latent. These findings are entirely consistent with in vitro 



(C) Conserved integration site within INT motif. Histogram maps the start site 
(5' end) of INT motif with respect to the integration site (dotted line). Peak 
shows that the majority of integration sites occur 20 bp from the 5' end of the 
motif start site. Shaded region represents the location of the INT motif relative 
to the majority of the integration sites. 

(D) Location of integration preference and INT motif inside Alu repeats is
overlapping. (Left) Locations of integration sites in Alu repeats were plotted relative
to the midpoint of the repeat. (Right) The location of the start site of INT motifs
within Alu repeats.

(E) Integrations are enriched inside Alu repeats. Total integrations identified 
inside Alu repeats were enumerated (red diamond) and compared to the ex- 
pected value as defined by Monte Carlo simulation. The boxplot displays the 
variation of the number of random integrations identified inside Alu repeats by 
each iteration of the simulation. 

(F) Integrations are near Alu repeats in genes and intergenic regions. Average
distance to the nearest Alu repeat for all integrations inside genes or intergenic
regions was calculated (red diamond) and compared to the expected distance
as defined by Monte Carlo simulation. The boxplot displays the variation of the
distance of random integrations from Alu repeats in genes or intergenic regions
by each iteration of the simulation.

(G) Distance to Alu repeats from the center of highly, medium, low, trace, or 
silently expressed genes. 

(H) Distance to Alu repeats in highly, medium, low, trace, or silently expressed 
genes. 

(I) Positive correlation between Alu repeats and integrations inside hotspots.
Graph shows number of Alu repeats (x axis) versus integrations in hotspots
(y axis). Hotspots not containing Alu repeats were removed from this analysis.
The scatter plot shows the linear relationship between the number of INT
motifs and integrations inside hotspots (Pearson’s correlation, ρ = 0.86).
experiments in cell lines showing that the level of HIV-1 transcription
is dependent, in part, on the status of surrounding chromatin
(Jordan et al., 2003; Jordan et al., 2001; Sherrill-Mix et al., 2013).

HIV-1 integration has been studied in multiple cell types, but
large libraries of integration sites in primary infected T cells
have only recently become available (Maldarelli et al., 2014;
Wagner et al., 2014). Integration sites obtained from in-vitro-infected
cell lines and primary T cells are distinct (Brady et al.,
2009; Sherrill-Mix et al., 2013). Nevertheless, common features
of HIV-1 integration have been defined, including the observation
that integration favors Alu repeats (Schroder et al., 2002). This
association was thought to be dependent on the presence of
these repeats in the introns of gene-rich regions and not on a
particular sequence feature (Schroder et al., 2002). However,
we observed that integration preferences for highly transcribed
genes and for Alu repeats seem to be independently important,
and furthermore, integrations are enriched near Alu repeats both
in genic and intergenic regions. One possible explanation for the
preference for Alu is the presence of an INT motif.
A TG-(N)5-7-CA sequence has been associated with sites of
HIV-1 integration, but an integration consensus has not been
defined (Brady et al., 2009; Holman and Coffin, 2005; Lewinski
et al., 2006; Serrao et al., 2014; Wang et al., 2007; Wu et al.,
2005). We found a 30 bp INT motif within 100 bp of 7% of all integrations,
the large majority of which are single events. As expected,
the HIV-1 INT motif contains a signature TG-(N)5-7-CA
and can form a hairpin structure, anchored on 5'-NTG-3', 5'-CAN-3'.
This motif is frequently found at the 3' end of Alu, where
it coincides with a peak of viral integration events, and viruses integrated
directly in this motif show a dramatic specificity for
insertion site. The asymmetric peak and specificity of the integration
site are remarkable. Nevertheless, we are likely underestimating
the frequency of integrations within Alu because we can
only map unique reads.

The observation that HIV-1 prefers to integrate in the neighbor- 
hood of Alu repeats is consistent with the finding that different in- 
dividuals have been reported to have multiple integrations in 
selected genes (Ikeda et al., 2007; Maldarelli et al., 2014; 
Schroder et al., 2002; Wagner et al., 2014). Our experiments
define a group of overlapping hotspots for integration that share 
many of the features of all HIV-1 integrations, including prefer- 
ence for introns of highly expressed genes and high density of 
Alu repeats. Viremic progressors showed the highest levels of 
hotspot integration, possibly because persistent integration 
leads to over-representation of these favored sites. Alternatively, 
these integrations might be positively selected by mechanisms 
that remain to be determined. 

Individuals receiving cART show increasing numbers of cells 
with identical viral genomes by SGA, suggesting clonal expan- 
sion of a subset of cells bearing integrated proviruses (Buzon 
et al., 2014; Chomont et al., 2009; Wagner et al., 2013). Two in- 
dependent groups have recently documented the long-term 
persistence of expanded clones of cells during therapy with 
cART (Maldarelli et al., 2014; Wagner et al., 2014). Our analysis
confirms and extends these observations by showing that, 
when considered as a group, expanded clones are less likely 
to occur when the provirus is in a genic region, and clones that 
are associated with genes tend to be in genes that are expressed 



at lower levels than single integrations. Thus, proviruses inserted 
into active regions of the genome, which would be more likely to 
support viral re-activation during T cell proliferation, are generally 
selected against during clonal expansion. 

Why certain integration sites are permissive for clonal expan- 
sion is not known, but finding that expanded clones with integra- 
tions occur in cancer-related genes led to the suggestion that 
integration into genes that regulate cell division promotes prolif- 
eration (Wagner et al., 2014). While we also found a higher pro- 
portion of integrations in cancer-related genes as compared to 
random, this bias was not different from that observed for other 
highly expressed genes favored by HIV-1. Further, we do not see
any differential bias for integration in cancer-related genes in
clonally expanded cells compared to single integrations, and we see an
overall decrease in the number of integrations in cancer-related
genes during the course of therapy. Since the number and size
of clones increase with time on therapy, the data indicate that 
integration into cancer genes is unlikely to be a general contrib- 
utor to the proliferation of infected cells. 

Our data show that cART selects for expanded clones and that 
viremic controllers resemble treated progressors in showing a 
higher proportion of expanded clones than untreated viremics. 
cART selects for clonal integrations irrespective of the location
in the genome. This is in contrast to single integrations, which
are selected against by therapy. cART specifically favors the survival
of single integrations in intergenic regions and is biased
against genic regions, with an overall half-life for single integrations
of 127 months. The half-life of single integrations is not
too dissimilar from the current estimate for the latent reservoir, 
which is believed to decay with a half-life of 44 months on 
cART (Finzi et al., 1999). 
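
To make the two half-lives quoted above easier to compare, the following is a minimal sketch that assumes simple first-order (exponential) decay. That decay model and the function names are illustrative assumptions, not the fitting procedure used in this study.

```python
import math


def fraction_remaining(months, half_life_months):
    """Fraction of a population left after `months`, assuming first-order decay."""
    return 0.5 ** (months / half_life_months)


def months_to_reduce(fold, half_life_months):
    """Time needed for a `fold`-fold reduction under the same assumption."""
    return half_life_months * math.log2(fold)


# Half-lives quoted in the text: 44 months (latent reservoir estimate)
# and 127 months (single integrations on cART).
for half_life in (44, 127):
    left = fraction_remaining(60, half_life)
    print(f"half-life {half_life} mo: {left:.2f} remaining after 60 mo")
```

Under this assumption, a population with a 127-month half-life declines roughly three times more slowly than one with a 44-month half-life.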

The major outstanding question after the discovery of clonally 
expanded cells with integrated HIV-1 is whether the virus from 
these cells contributes to the latent reservoir (Maldarelli et al., 
2014; Wagner et al., 2014). Several different independent lines 
of evidence argue against this idea. First, although the latent 
reservoir is thought to be contained primarily in resting central 
memory CD4+ T cells (Siliciano and Greene, 2011), we find that
clonally expanded viral integrations are found in all three memory 
T cell compartments. Second, whereas the reservoir appears to 
decay with time on cART, we find that clonally expanded integra- 
tions increase with time and do so irrespective of whether they 
are found in genes or intergenic regions. In contrast, single inte- 
grations in more active parts of the genome, which are more 
likely to support HIV-1 reactivation, are selected against with 
time on ART. Finally, all 75 of the clonally expanded proviruses 
tested were defective, which is in agreement with two examples 
in the literature (Imamichi et al., 2014; Josefsson et al., 2013). 
Thus, we conclude that intact virus is not enriched in infected 
expanded cells. However, we cannot rule out the possibility 
that a rare clone of cells contains an active virus. Nevertheless, 
the 90% of all cells bearing integrated proviruses that account 
for expanded clones of infected cells in cART-treated progres- 
sors appear unlikely to be the major source of the rebounding 
latent reservoir. Instead, the replication-competent reservoir is 
likely to be contained in the remaining 10% of cells that harbor 
single integrations that decline with a long half-life on cART 
(Figure S5). 
In conclusion, the data indicate that HIV-1-infected T cells
that undergo clonal expansion are able to do so because their
proviruses are defective and that the replication-competent
reservoir is found in the subset of CD4+ T cells that remain
quiescent.

EXPERIMENTAL PROCEDURES 

CD4+ T Cell Isolation for Integration Library Construction

Human samples were collected after signed informed consent in accordance
with Institutional Review Board (IRB)-reviewed protocols by all participating
institutions. Patients 1, 2, and 3 were selected from the Seattle HIV longitudinal
cohort studies at Fred Hutchinson Cancer Research Center. Patients 4, 8, and
9 were recruited from the University of Cologne, and samples were obtained at
Rockefeller University (MNU_0628). Patients 5, 6, and 7 were selected from the
Rockefeller University HIV-1 antibody therapy clinical trial. Patients 10, 11, 12,
and 13 were selected from a group of elite controllers that were followed at the
Ragon Institute in Boston.

CD4+ T cells were isolated from whole PBMC using anti-CD4 microbeads
(Miltenyi Biotec). The percentage of live cells was determined by flow cytometry
based on forward and side scatter. Purity of CD4+ T cells was determined
by labeling isolated cells with anti-human CD3, CD4, CD8, CD19, and HLA-DR
and gating on CD3, CD4 double-positive cells. Isolated cells were used for
library construction only if purity was >75%. CD4+ T cell subsets were isolated
by FACSorting on a BD Aria II by labeling cells with anti-human CD3, CD4,
CD8, CD66b, CD335, HLA-DR, CCR7, CD27, and CD45RA. Analysis of
CD4+ T cell subsets was done by pooling cellular DNA isolated from multiple
sorts of the same sample.

Quantitative Viral Outgrowth Assay 

Viral outgrowth was performed as previously described (Laird et al., 2013).

Integration Library 

The method for integration library construction was adapted from TC-Seq 
(Klein et al., 2011). 

DNA Preparation 

DNA from 0.2-2 million CD4+ cells from HIV-1-infected patients was isolated
and prepared as previously described (Klein et al., 2011). Fragments were
ligated to 200 pmol of annealed linkers (Table S2). Virus sequences were eliminated
by digestion with BglII (NEB), and fragments were purified.

Integration Site Amplification

Semi-nested ligation-mediated PCR was performed on DNA. All PCRs were
performed using Phusion Polymerase (Thermo). DNA was divided into 700 ng
aliquots and subjected to single-primer PCR with biotinylated LTR1 [1 ×
(98°C-1 min) 12 × (98°C-15 s, 62°C-30 s, 72°C-30 s) 1 × (72°C-5 min)] (Table
S2). Each reaction was spiked with pLinker and subjected to additional
cycles of PCR [1 × (98°C-1 min) 25 × (98°C-15 s, 62°C-30 s, 72°C-30 s) 1 ×
(72°C-5 min)]. Products of 300-1,000 bp were isolated by agarose gel electrophoresis
and magnetic streptavidin bead purification. Semi-nested PCR was
performed on the magnetic beads first with a single primer LTR2 (same
cycling conditions as above) followed by spiking in pLinker and additional
cycles (Table S2). Products of 300-1,000 bp were isolated by gel
electrophoresis.

Paired-End Library Preparation 

Linkers were digested by AscI such that a six-nucleotide barcode (CGCGCC)
was left on the DNA fragments, indicating linker-dependent amplification.
Fragments were blunted by End-It DNA Repair Kit (Epicenter), purified
with AmPure beads (Agencourt), and ligated to NextFlex paired-end adapters.
Adaptor-ligated fragments were enriched by 35 cycles of PCR with NextFlex
primers [1 × (98°C-1 min) 35 × (98°C-15 s, 66°C-30 s, 72°C-30 s) 1 × (72°C-5 min)],
and fragments between 300 and 1,000 bp were isolated by gel electrophoresis.
Two or three libraries were mixed in equimolar ratios and sequenced by
either 150 bp paired-end sequencing on Illumina MiSeq or 150 bp or 100 bp
paired-end sequencing on an Illumina 2500 HiSeq. Data are accessible via
NCBI SRA using the accession number: SRP045822.



Computational Analysis 
Read Alignment 

Paired-end reads were mapped to the HIV-1 sequence (designated as a bait)
using BLAT (Kent, 2002) with default settings. Reads that were mapped to the
bait without mismatches were checked for the linker barcode in the paired-end
read and were mapped to the human genome reference GRCh37/hg19 with
Bowtie (Langmead et al., 2009). Only uniquely mapped reads (allowing for
up to two mismatches) were used as defined in the best alignment stratum
(command line options: -v 2 --all --best --strata -m 1). Identical reads generated
by PCR amplification were merged.

Integration Determination 

Once the paired-end reads were properly mapped in the bait and human
genome (see above), we determined the integration breakpoint by aligning
the remaining nucleotide sequence containing the 3' terminus of the HIV-1
LTR to the human genome using BLAT (default settings). Only uniquely mapped
reads up to 1 kb away from their partners were kept. Adjacent (within 50
nucleotides) putative integration sites were merged. Finally, the 5' ends of
the paired-end reads were used to deduce the integration and shear position
sites in the human genome.
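
The merging of adjacent putative integration sites described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name, the `(chromosome, position)` input format, and the choice to chain sites by the distance to the last site in a cluster are all assumptions.

```python
def merge_adjacent_sites(sites, max_gap=50):
    """Merge putative integration sites lying within `max_gap` nt of each other.

    `sites` is an iterable of (chromosome, position) tuples. Returns a list of
    (chromosome, first_position, last_position, n_sites) clusters. Sites are
    chained: a site joins a cluster if it lies within `max_gap` of the last
    site already in that cluster (an assumption of this sketch).
    """
    merged = []
    for chrom, pos in sorted(sites):
        if merged and merged[-1][0] == chrom and pos - merged[-1][2] <= max_gap:
            _, start, _, count = merged[-1]
            merged[-1] = (chrom, start, pos, count + 1)
        else:
            merged.append((chrom, pos, pos, 1))
    return merged


# Hypothetical coordinates, for illustration only.
example = [("chr1", 100), ("chr1", 130), ("chr1", 300), ("chr2", 120)]
print(merge_adjacent_sites(example))
```

Here the two chr1 sites 30 nt apart collapse into one cluster, while the site at position 300 and the chr2 site stay separate.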

Hotspot Detection 

To detect preferred sites of HIV-1 integration genome wide, we subjected our
dataset to hot_scan software analysis (Silva et al., 2014), which defines hotspots
by scan statistics. Hotspots obtained by hot_scan were defined using
different window widths (100, 200, 500, 1,000, 2,000, 5,000, 10,000, 20,000,
50,000, and 100,000 bp).

Motif Analysis

To determine a consensus motif, 100 bp flanking each integration site was
analyzed for the presence of a 30 bp consensus sequence using MEME software
(Bailey and Elkan, 1994).

Monte Carlo Simulation for Virus Integration and Hotspots 

Monte Carlo simulation was conducted by shuffling the genomic locations 
of all virus integration sites 10,000 times using bedtools shuffle utility 
(Quinlan and Hall, 2010). Then, we compared the observed number with 
the median number of integrations in the randomized list. We assessed 
enrichments by p value by counting the frequency of observed events 
being equal to or higher than the number of randomized events divided by 
n = 10,000. 
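
The permutation logic above can be sketched in a few lines. This is a simplified illustration and not the analysis code of the study: it treats the genome as a single linear coordinate range instead of calling bedtools, all names are hypothetical, and the p value uses the conventional one-sided empirical definition (the fraction of randomized sets with at least as many hits as observed).

```python
import bisect
import random


def count_in_features(sites, starts, ends):
    """Count positions that fall inside any [start, end) interval.

    `starts` and `ends` describe sorted, non-overlapping intervals.
    """
    hits = 0
    for pos in sites:
        i = bisect.bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits += 1
    return hits


def empirical_enrichment(sites, features, genome_length, n_iter=10_000, seed=0):
    """Compare observed overlaps with those of randomly placed sites.

    Returns (observed_count, median_random_count, empirical_p_value).
    """
    rng = random.Random(seed)
    features = sorted(features)
    starts = [s for s, _ in features]
    ends = [e for _, e in features]
    observed = count_in_features(sites, starts, ends)

    null_counts = []
    for _ in range(n_iter):
        # Place the same number of sites uniformly at random (toy "shuffle").
        shuffled = [rng.randrange(genome_length) for _ in sites]
        null_counts.append(count_in_features(shuffled, starts, ends))

    null_counts.sort()
    median = null_counts[len(null_counts) // 2]  # upper median for even n_iter
    p_value = sum(1 for c in null_counts if c >= observed) / n_iter
    return observed, median, p_value
```

With real data, the `features` list would hold intervals such as Alu repeats, and the random placement would need to respect chromosome boundaries and unmappable regions, which the bedtools shuffle utility is designed to handle.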

Statistical Analysis 

Proportion test is the standard test for the difference between proportions,
also known as a two-proportion z test. We used R’s implementation of this
via the prop.test() function.
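
For readers who want to see the arithmetic, a pooled two-proportion z test can be sketched as below. This is an illustrative stand-in, not the study's code: R's prop.test() reports a chi-squared statistic and applies a continuity correction by default, so its p values can differ slightly from this uncorrected version. The function name and example counts are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided, pooled two-proportion z test without continuity correction.

    x1/n1 and x2/n2 are the success counts and sample sizes of the two groups.
    Returns (z, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 80 of 100 versus 50 of 100.
z, p = two_proportion_z_test(80, 100, 50, 100)
print(f"z = {z:.2f}, p = {p:.2g}")
```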

Integration Library Verification 

To verify our integration sequencing strategy, we constructed two libraries 
from DNA isolated from un-infected individuals. We recovered 13 sequences 
that mapped to integration sites. We subtracted these “integration sites” 
from all libraries before proceeding. 

To test the saturation of our method, two separate integration libraries were 
constructed from identical samples for three patients. We found that both li- 
braries contained the same expanded clonal families, but the majority of single 
virus integrations were unique to each sample of cells used for library construction.
Single viral integrations found in both libraries were less than 1%
of observed viral integrations.

PCR Verification 

Genomic DNA isolated as described was serially diluted and subjected to
nested PCR using genomic specific primers and primers LTR1 and LTR2
(Table S2) using HotStart Taq Polymerase (QIAGEN) [1 × (98°C-14 min) 40 ×
(98°C-30 s, 55°C-30 s, 72°C-30 s) 1 × (72°C-5 min)]. Products were isolated by
gel electrophoresis and sequenced directly. Analysis of clones in this manner
indicated that we underestimate the size of clones by a factor of four to five (data
not shown).

CD4+ T Cell Subset Sorting

To isolate CD4+ subsets, we labeled PBMCs with antibodies to CD45RA, CD4,
CD8, CD66b, CCR7, CD335, HLA-DR, CD3, and CD27. We separated T cell
subsets by FACS Aria (BD Biosciences) to very high purity (>98%).
Virus Sequencing
5' LTR

5' LTRs from large clones were amplified with nested genomic primers and
LTR2Rev (Table S2) using Platinum High Fidelity Taq (Invitrogen) [1 × (98°C-14 min)
40 × (98°C-30 s, 55°C-30 s, 68°C-1 min) 1 × (68°C-5 min)]. Products
were isolated by gel electrophoresis and sequenced directly.

Full-Length Virus 

Full-length genomic DNA from infected patients was isolated as described and
serially diluted. Each well was filled to a final volume of 50 µl with PCR reaction
mixture (Platinum Taq MasterMix, Invitrogen) and primers to amplify virus from
a specific integration site in the genome (Table S2 and Ho et al., 2013) using
touchdown cycling to increase specificity. Then, 2 µl aliquots from the first
PCR were subjected to nested genomic PCR and 1% gel electrophoresis.
The positive wells were gel purified, and fragments were sequenced directly.

ACCESSION NUMBERS 

Data are accessible via NCBI SRA using the accession number SRP045822.
SUMMARY

Antibodies developed during HIV-1 infection lose 
efficacy as the viral spike mutates. We postulated 
that anti-HIV-1 antibodies primarily bind monova- 
lently because HIV’s low spike density impedes 
bivalent binding through inter-spike crosslinking, 
and the spike structure prohibits bivalent binding 
through intra-spike crosslinking. Monovalent binding 
reduces avidity and potency, thus expanding the 
range of mutations permitting antibody evasion. To 
test this idea, we engineered antibody-based mole- 
cules capable of bivalent binding through intra-spike 
crosslinking. We used DNA as a “molecular ruler”
to measure intra-epitope distances on virion-bound 
spikes and construct intra-spike crosslinking mole- 
cules. Optimal bivalent reagents exhibited up to 2.5 
orders of magnitude increased potency (>100-fold
average increases across virus panels) and identified 
conformational states of virion-bound spikes. The 
demonstration that intra-spike crosslinking lowers 
the concentration of antibodies required for neutrali- 
zation supports the hypothesis that low spike den- 
sities facilitate antibody evasion and the use of mol- 
ecules capable of intra-spike crosslinking for therapy 
or passive protection. 

INTRODUCTION 

The HIV-1 envelope (Env) spike trimer, a trimer of gp120 and
gp41 subunits, is the only target of neutralizing antibodies.
The spike utilizes antibody-evasion strategies, including mutation,
glycan shielding, and conformational masking (West et al.,
2014). Although important, these features are not unique to
HIV-1; other viruses employing these strategies elicit IgG antibody
responses that provide sterilizing immunity or viral clearance.
A potentially unique antibody-evasion strategy for HIV-1
involves hindering IgGs from using both antigen-binding fragments
(Fabs) to bind bivalently to spikes (Klein and Bjorkman,
2010; Mouquet et al., 2010). This is accomplished by the small
number and low density of Env spikes (Chertova et al., 2002; 
Liu et al., 2008; Zhu et al., 2006), which prevent most IgGs 
from inter-spike crosslinking (bivalent binding between spikes), 
and the architecture of the Env trimer, which impedes intra-spike 
crosslinking (bivalent binding within a spike trimer) (Klein et al., 
2009; Luftig et al., 2006). 

On a typical virus with closely spaced envelope spikes, an IgG 
antibody can bind using both Fabs to crosslink neighboring 
spikes, leading to a nearly irreversible antibody-antigen interac- 
tion (Mattes, 2005). Avidity effects from bivalent binding of IgG 
antibodies have been shown to be critical for neutralization of 
many viruses, including polio and influenza (Icenogle et al., 
1983; Schofield et al., 1997). By contrast, the small number of 
spikes (~14) present on the surface of HIV-1 (Chertova et al.,
2002; Liu et al., 2008; Zhu et al., 2006) impedes simultaneous
engagement of both antibody combining sites (Klein and Bjorkman,
2010; Mouquet et al., 2010); most spikes are separated
by distances that far exceed the ~15 nm reach of the two Fab 
arms of an IgG (Liu et al., 2008; Zhu et al., 2006) (Figure 1A). In- 
ter-spike crosslinking might still be possible if spikes could freely 
diffuse within the viral membrane. However, cryo-electron to- 
mography of HIV-1 (Zhu et al., 2006) and evidence for interac- 
tions between the cytoplasmic tail of gp41 and the matrix protein 
of HIV (Bhatia et al., 2009; Crooks et al., 2008; Yu et al., 1992) 
suggest that a virion’s spike distribution is likely to be relatively 
static over timescales relevant to neutralization. Taken together, 
the mechanisms to hinder inter- and intra-spike crosslinking 
imply that most anti-HIV-1 IgGs bind monovalently to virions. 

It seems an unlikely coincidence that HIV-1, among the most
adept of viruses at evading antibody-mediated neutralization, 
has an unusually low density of surface envelope spikes with 
restricted mobility, as well as an unusually high mutation rate. 
We speculated that HIV-1 evolved a low spike density to hinder 
bivalent binding by antibodies (Klein and Bjorkman, 2010) and 
postulated that the combination of predominantly monovalent 
IgG binding and HIV-1’s rapid mutation rate creates an additional
effective antibody evasion strategy (Klein and Bjorkman, 2010). If
the affinity between an IgG Fab and a viral spike is high enough, 
monovalent IgG binding to a virion should not, in and of itself, 
hinder or prevent viral neutralization. Thus, affinity-matured 
anti-Env IgGs raised against a particular strain of virus can 

Cell 160, 433-446, January 29, 2015 ©2015 Elsevier Inc.









Figure 1. IgG and diFab Reagents Binding to Viral Spikes

(A) Top: IgG binding monovalently to spikes on HIV-1 surfaces, which include a
small number (~14) and low density of Env (Chertova et al., 2002; Liu et al., 2008;
Zhu et al., 2006). Bottom: homo-diFab reagent binding bivalently to HIV-1 Env by
intra-spike crosslinking. Schematic representations of Env adapted from figures
in Liu et al. (2008).

(B) Schematic of method used to produce homo- and hetero-diFabs.

See also Figure S1.
effectively neutralize autologous virus (Klein et al., 2013; West
et al., 2014). However, upon mutation of an antibody epitope
on Env, the low affinity of the monovalent Fab-antigen interaction
would result in either complete loss of neutralization or neutralization
only at very high concentrations. These concepts are illustrated
by comparisons of binding and neutralization for variants
of IgG and Fab forms of palivizumab, a neutralizing IgG against
respiratory syncytial virus (RSV) (Wu et al., 2005), a virus with
a high density of Env spikes (Liljeroos et al., 2013). Palivizumab Fabs
with fast off-rates/low affinities exhibited 2-3 log improvements in
neutralization potencies when converted to bivalent IgGs, and the
potencies of the IgGs were not affected by mutations that increased
the off-rates of their corresponding monovalent Fabs by >100-fold
(Wu et al., 2005), illustrating the importance of avidity for IgGs with
weak or moderate affinity Fabs. However, high affinity/slow off-rate
palivizumab Fabs were as potent as their IgG counterparts, which could
bind bivalently to RSV through inter-spike crosslinking. In the
palivizumab example, binding and neutralization potencies were evaluated
for a single strain of virus and antibodies. In the case of HIV-1, we are
interested in the effects of mutations in the virus on binding of the
same antibody, but the effects of mutation are expected to be similar.
Thus, we postulate that avidity effects through bivalent binding can
serve as a buffer to dampen the effects of viral mutations on
neutralization potencies of IgGs.

This line of reasoning suggests that bivalent HIV-1 binders would be
optimal for passive prevention or immunotherapy, but because inter-spike
distances are not constant, even on a single virion, it is not possible
to engineer reagents that could consistently accomplish inter-spike
crosslinking. In contrast, reagents that can bind bivalently to a single
trimeric spike would function independently of both spike
density and distribution (Pace et al., 2013). To test the idea
that intra-spike crosslinking results in increased neutralization
potency, we used molecular rulers to map epitopes on virion-bound
HIV-1 spikes and created molecules designed to synergize
through bivalent interactions within single Env trimers
(Figure 1A). We developed methods to produce multiple combinations
of Env binders separated by different distances by
attaching broadly neutralizing antibody (bNAb) Fabs to variable-length
double-stranded DNA (dsDNA) (Figures 1B and S1). We
chose dsDNA as a linker because its long persistence length
(460-500 Å [Bednar et al., 1995] compared with ~30 Å for peptides
[Zhou, 2004]) permits its use as a molecular ruler with
3.4 Å/base pair (bp) increments. Here, we show that homo-
and hetero-diFabs joined by optimal-length dsDNA bridges
can achieve neutralization potency increases of two to three orders
of magnitude and provide evidence that the synergy results
from intra-spike crosslinking. Upon determining the optimal
distances between Env trimer-bound Fabs, we show that it is
possible to convert the dsDNA bridge to a protein linker to create
a protein-based reagent with similar synergistic properties.
These results illustrate the importance of avidity in antibody-pathogen
interactions, elucidate mechanisms by which HIV-1
evades the host immune system, and are relevant to the choice
of potential protein therapeutics to be delivered to prevent or
treat HIV-1 infections.

RESULTS 

Homo-diFabs Exhibit Length-Dependent Avidity Effects 
Consistent with Intra-Spike Crosslinking 

Fabs were modified to contain a free thiol and then conjugated to
maleimide-activated single-stranded DNA (ssDNA) (Figure 1B).
Different lengths of dsDNA (designed to lack secondary structures
[Zadeh et al., 2011]) (Extended Experimental Procedures
and Table S6) were annealed with and ligated to the ssDNA-Fab
conjugates to create homo- or hetero-diFabs, in which the
two Fabs were the same or different, respectively. Dynamic light
scattering confirmed that conjugates with longer DNA bridges
were more extended (Figure 2A), supporting the use of dsDNA
as a ruler. Inter-Fab distances calculated from dsDNA lengths
were regarded as approximate because the DNA linkers
included short regions of ssDNA (persistence length 22 Å) (Chi
et al., 2013) to permit orientational flexibility.

We first determined the optimal dsDNA linker for a homo-diFab
constructed from 3BNC60, a bNAb against the CD4 binding site (CD4bs)
on the gp120 subunit of Env (Scheid et al., 2011), by evaluating
homo-diFabs with different dsDNA lengths using in vitro neutralization
assays. The 50% inhibitory concentrations (IC50s) against HIV-1 strain
6535.3 depended on the dsDNA length, with the most potent homo-diFab
containing a bridge of 62 bp (211 Å) (Figures 2B and S2). This length
is close to the predicted distance (~198 Å) between the C termini of
adjacent 3BNC60 Fabs bound to the open structure of an HIV-1 trimer
(Merk and Subramaniam, 2013) (Figures 3 and S3). Bridge lengths of
~60 bp also exhibited the best potencies for 3BNC60 homo-diFabs
against DU172.17 HIV-1 and for homo-diFabs constructed from VRC01
(Wu et al., 2010), a related CD4bs bNAb (Figure S2). The ~100-fold
increased potency of 3BNC60-62bp-3BNC60 compared with 3BNC60 IgG
against HIV-1 6535.3 (Figure 2B) suggested synergy resulting from
avidity effects due to bivalent binding. The bivalent interaction
likely resulted from intra-spike crosslinking rather than inter-spike
crosslinking: the latter should not show a sharp length dependence
because inter-spike distances are variable within and between virions
(Liu et al., 2008; Zhu et al., 2006).
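
The ruler arithmetic used above can be sketched in a few lines. This is only an illustration of the stated 3.4 Å/bp increment; the function name and the constant name are assumptions introduced here, and the short ssDNA regions that make real inter-Fab distances approximate are ignored.

```python
# Minimal sketch: convert a dsDNA bridge length in base pairs to Angstroms,
# using the 3.4 Å per base pair increment quoted in the text.
# The names below are illustrative, not taken from the paper.
RISE_PER_BP_ANGSTROM = 3.4


def bridge_length_angstrom(n_bp: int) -> float:
    """Approximate end-to-end length of an n_bp dsDNA bridge, in Angstroms."""
    if n_bp < 0:
        raise ValueError("number of base pairs must be non-negative")
    return n_bp * RISE_PER_BP_ANGSTROM


if __name__ == "__main__":
    # Bridge lengths mentioned in the text.
    for n_bp in (40, 50, 60, 62, 80, 100):
        print(f"{n_bp} bp -> {bridge_length_angstrom(n_bp):.0f} Å")
```

For the 62 bp bridge this gives about 211 Å, which can be compared with the ~198 Å predicted separation between adjacent 3BNC60 Fabs on the open trimer.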



To formally assess the extent to which inter-spike crosslinking
could contribute to synergy, we evaluated homo-diFabs constructed
from the V1V2 loop-specific bNAb PG16 (Walker et al., 2009), which
cannot crosslink within a single spike because only one anti-V1V2 Fab
binds per Env trimer (Julien et al., 2013b). PG16 homo-diFabs with
different dsDNA bridges did not exhibit length-dependent
neutralization profiles against strain 6535.3 (Figure 2B) and other
viral strains (Figure S2D). However, increased potencies were observed
for PG16 homo-diFabs with >70 bp or 80 bp (>248 Å or 272 Å) bridges,
perhaps reflecting increased inter-spike crosslinking with longer
separation distances (Figures 2B and S2D).

Comparison of Homo-diFabs that Can or Cannot Exhibit 
Intra-Spike Crosslinking 

To evaluate the potential for intra-spike crosslinking across
different viral strains, we compared homo-diFabs designed to be
capable (b12 and 3BNC60) or incapable (PG16) of intra-spike
crosslinking (Figure 2C). To minimize inter-spike crosslinking, the
homo-diFabs were constructed with 60-62 bp bridges. The b12-60bp-b12
homo-diFab exhibited increased potency compared with b12 IgG in 21 of
25 strains in a cross-clade panel of primary HIV-1, with potency
increases >10-fold for 16 strains and a geometric mean potency
increase of 22-fold. 3BNC60-62bp-3BNC60 showed even more consistent
synergy, being more potent than 3BNC60 IgG against all 25 strains
tested, with >10-fold increases for 20 strains and a mean increase of
19-fold. By contrast, the PG16-60bp-PG16 homo-diFab showed potency
increases compared with PG16 IgG against only six strains, with
relatively small (2- to 7-fold) increases in five strains and an
overall 2.8-fold mean potency change.
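
Because fold changes span several orders of magnitude, the summary statistic matters. The sketch below shows how a geometric mean of per-strain fold improvements differs from an arithmetic mean. The fold values are hypothetical placeholders, not data from this study, and how censored entries (those reported with ">" or "<") were handled in the published means is not stated here, so the sketch accepts only exact positive numbers.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers (mean taken in log space)."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, all strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    """Plain arithmetic mean, for comparison."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical fold improvements (IgG IC50 / diFab IC50) for three strains.
    hypothetical_folds = [1.0, 10.0, 100.0]
    print("geometric mean :", geometric_mean(hypothetical_folds))
    print("arithmetic mean:", arithmetic_mean(hypothetical_folds))
```

With these placeholder values a single large fold change dominates the arithmetic mean, whereas the geometric mean weights each strain equally on a log scale.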

Hetero-diFabs Exhibit Dramatic Potency Increases 
Consistent with Intra-Spike Crosslinking 

To determine whether heterotypic bivalent binding can produce
synergy and to measure distances between epitopes, we used dsDNA to
link Fabs recognizing different epitopes on gp120. We first evaluated
hetero-diFabs constructed with Fabs from V1V2 (PG16 or PG9) (Walker
et al., 2009) and CD4bs (b12 or 3BNC60) (Roben et al., 1994; Scheid
et al., 2011) bNAbs linked with 60 bp dsDNA bridges. PG16-60bp-b12
hetero-diFabs were evaluated in neutralization assays against HIV-1
strains SC422661.8 (more sensitive to b12 than PG16) and CAP210
(more sensitive to PG16 than b12). According to the model being
tested, in the absence of synergistic binding (i.e., when only one
Fab can bind to a spike at a time), a hetero-diFab would be no more
potent than a non-covalent mixture of the dsDNA and the two Fabs
against each viral strain, whereas synergistic binding would result
in avidity effects exhibited by increased potency of the
hetero-diFab. For both viral strains, the PG16-60bp-b12 hetero-diFab
was ~10-fold more potent than the mixture of Fabs plus dsDNA or the
more potent of the two Fabs alone (Figures 4 and S4). To more
systematically explore potential synergy, we evaluated PG16-60bp-b12
against a 25 member panel of HIV-1 strains, finding synergistic
effects (between 2- and 145-fold more potent than the corresponding
non-covalent mixture for most strains; geometric mean improvement of
4.7-fold) (Table S1). When Fabs from PG16 or PG9 were combined with

IC50 (nM)

| Virus | Clade | Tier | dsDNA | b12-60bp-b12 | b12 Fab | b12 IgG | 3BNC60-62bp-3BNC60 | 3BNC60 Fab | 3BNC60 IgG | PG16-60bp-PG16 | PG16 Fab | PG16 IgG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6535.3 | B | 1B | >1000 | 2.7 (22x) | 360 | 60 | 0.24 (17x) | 27 | 4.0 | 7.8 (2x) | >200 | 17 |
| SC422661.8 | B | 2 | >1000 | 0.55 (5x) | 9.5 | 3.0 | 0.02 (15x) | 1.0 | 0.3 | 1.3 (4x) | 65.0 | 4.7 |
| PVO.4 | B | 3 | >1000 | >70 (1x) | 4700 | 700 | 0.01 (10x) | 1.3 | 0.1 | 10 (2x) | 150 | 16 |
| TRO.11 | B | 2 | >1000 | 8.9 (>79x) | 7600 | 700 | 0.01 (20x) | 1.3 | 0.2 | 7.8 (1x) | >200 | 6.5 |
| TRJO4551.58 | B | 3 | >1000 | >70 (1x) | 7900 | 700 | 0.03 (<3x) | 4.8 | <0.1 | 0.70 (3x) | 64 | 2.1 |
| CAAN5342.A2 | B | 2 | >1000 | 35 (20x) | 6800 | 700 | 0.16 (22x) | 6.2 | 3.6 | 8.7 (1x) | 65 | 6.0 |
| THRO4156.18 | B | 2 | >1000 | 2.7 (4x) | 100 | 10 | 0.43 (8x) | 23 | 3.5 | 4.9 (0.5x) | >200 | 2.7 |
| RHPA4259.7 | B | 2 | >1000 | 0.48 (1x) | 6.0 | 0.73 | 0.007 (43x) | 1.5 | 0.3 | 0.07 (1x) | 1.5 | <0.1 |
| Du156.12 | C | 2 | >1000 | 0.21 (23x) | 150 | 4.8 | 0.03 (10x) | 9.4 | 0.3 | <0.04 (1x) | 0.8 | <0.1 |
| Du172.17 | C | 2 | >1000 | 0.35 (28x) | 100 | 10 | 1.0 (16x) | 210 | 16 | <0.04 (1x) | 2.6 | <0.1 |
| DU422.1 | C | 2 | >1000 | 0.03 (67x) | 60 | 2.0 | 42 (>8x) | >200 | >330 | <0.04 (>7x) | 1.0 | 0.3 |
| ZM197M.PB7 | C | 1B | >1000 | 4.9 (^6x) | 1000 | 620 | 0.11 Ofel | 15 | 3.8 | 0.42 (6x) | 5.5 | 2.4 |
| ZM214M.PL15 | C | 1B | >1000 | 3.8 (11x) | 380 | 43 | 0.07 (24x) | 6.9 | 1.7 | >70 (1x) | >200 | >330 |
| ZM233M.PB6 | C | 2 | >1000 | 16 (44x) | 4500 | 700 | 0.08 (17x) | 14 | 1.4 | <0.04 (1x) | <0.1 | <0.1 |
| ZM249M.PL1 | C | 2 | >1000 | 1.1 (21x) | 240 | 23 | 0.01 (30x) | 2.7 | 0.3 | <0.04 (1x) | 0.4 | <0.1 |
| ZM53M.PB12 | C | 2 | >1000 | 3.4 (56x) | 890 | 190 | 0.11 (12x) | 5.2 | 1.3 | <0.04 (1x) | 1.0 | <0.1 |
| ZM109F.PB4 | C | 1B | >1000 | 67 (10x) | 3400 | 700 | 0.05 (6x) | 39 | 0.3 | 0.14 (214x) | 6.7 | 30 |
| ZM135M.PL10a | C | 2 | >1000 | 46 (15x) | 5900 | 700 | 0.03 (13x) | 3.7 | 0.4 | >70 (1x) | >200 | >330 |
| CAP45.2.00.G3 | C | 2 | >1000 | 0.35 (10x) | 40 | 3.4 | 0.3 (167x) | >200 | 50 | <0.04 (1x) | <0.1 | <0.1 |
| CAP210.2.00.E8 | C | 2 | >1000 | 5.7 (9x) | 70 | 50 | 1.5 (39x) | >200 | 59 | 0.14 (1x) | 11.0 | <0.1 |
| Q842.d12 | A | 2 | | | | | 0.001 (90x) | 0.21 | 0.09 | <0.04 (1x) | <0.1 | <0.1 |
| Q259.d2.17 | A | 2 | >1000 | 48 (15x) | 7500 | 700 | 0.007 (4x) | 3.7 | 0.03 | <0.04 (1x) | 0.61 | <0.1 |
| 3718.v3.c11 | A | 2 | >1000 | <0.04 (>3100x) | 690 | 126 | 12.6 (>26x) | >200 | >330 | <0.04 (1x) | <0.1 | <0.1 |
| 0330.v4.c3 | A | 2 | >1000 | 60 (12x) | 5600 | 700 | 0.007 (29x) | 0.62 | 0.2 | <0.04 (1x) | <0.1 | <0.1 |
| 3415.v1.c1 | A | 2 | >1000 | 2.3 (23x) | 200 | 53 | 0.04 (22x) | 4.6 | 0.9 | 0.07 (1x) | 1.22 | <0.1 |
3.6   0.08   0.42
18   2.3   2.7

Mean Fold Improvement Over IgG (Geometric): b12-60bp-b12, 22; 3BNC60-62bp-3BNC60, 19; PG16-60bp-PG16, 2.8
Mean Fold Improvement Over IgG (Arithmetic): b12-60bp-b12, 18; 3BNC60-62bp-3BNC60, 14; PG16-60bp-PG16, 2.1

Figure 2. Characterization of Homo-diFabs

(A) Dynamic light-scattering measurements of hydrodynamic radii for IgG and Fab proteins, different lengths of dsDNA alone, and di-Fabs with different dsDNA
linkers.

(B) Effects of dsDNA bridge length on neutralization potencies of 3BNC60 and PG16 homo-diFabs against the Tier 1B HIV-1 strain 6535.3. Neutralization IC50s are
plotted against the length of the dsDNA linker. IC50s for the parent IgG and Fab are indicated as red and blue lines, respectively.

(legend continued below)

Figure 3. Comparison of Intra-Spike Distances for Three Conformations
Found for Virion-Associated HIV-1 Env Spike Trimers

(A) Three conformations of Env trimers shown as surface
representations (top row: gp120 coordinates only) and schematically
(bottom two rows). Schematic representations of Env trimers adapted
from figures in Liu et al. (2008). Env spikes are shown as seen from
above (top and middle rows) and the side (bottom row). V1V2 loops are
cyan, V3 loops are purple, the CD4 binding site is yellow, the
remainder of gp120 is maroon, gp41 is green, and the membrane bilayer
is gray. The closed structure (PDB code 4NCO) was observed for
unliganded trimers (Liu et al., 2008) and trimers associated with
Fabs from potent VRC01-like (PVL) antibodies (Lyumkis et al., 2013;
Merk and Subramaniam, 2013). The open structure was observed for
trimers associated with CD4 or the Fab from the CD4-induced antibody
17b (Merk and Subramaniam, 2013; Tran et al., 2012) (coordinates
obtained from S. Subramaniam). The partially open structure was
observed for trimers associated with the Fab from b12 (Liu et al.,
2008; Merk and Subramaniam, 2013) (PDB code 3DNL).

(B) Measured distances between homo-diFabs bound to HIV-1 trimer
structures. Fabs from the indicated bNAbs are shown bound to the
gp120 portions of Env in the three conformations shown in (A). Fabs
are shown as ribbons; gp120 subunits are shown as surface
representations with V1V2 loops in cyan, V3 in purple, the CD4
binding site in yellow, and the remainder of gp120 in maroon. The
distance between the Cys233(heavy chain) carbon-α atoms of adjacent
bound Fabs is indicated by a gray line as an approximation of an
optimal length for a dsDNA bridge attached to Cys233(heavy chain).
Assuming 3-fold symmetry of trimers, only one distance is possible
for bound 3BNC60, b12, and 10-1074 homo-diFabs.

See also Figure S3.




a more potent CD4bs-recognizing bNAb (3BNC60), the resulting
hetero-diFabs exhibited greater synergy: several examples of
>150-fold improvement for PG16-60bp-3BNC60 and PG9-60bp-3BNC60 and
geometric mean potency improvements of 29- and 68-fold, respectively
(Figure 4 and Tables S2 and S3). Other hetero-diFabs, constructed
with combinations of Fabs recognizing the CD4bs (3BNC60 [Scheid
et al., 2011]), the gp120 V3 loop (10-1074 [Mouquet et al., 2012]),
and a gp41 epitope (10E8 [Huang et al., 2012]), also showed
synergistic effects (Figure 4 and Table S4), and a 3BNC60-60bp-b12
hetero-diFab exhibited up to 660-fold synergy and a geometric mean
potency increase of 90-fold (Figure 4 and Table S5). In contrast,
analogous IgG heterodimers, constructed with two different Fabs
linked to a single Fc (Schaefer et al., 2011), did not show synergy
when evaluated against the same viruses, demonstrating that
synergistic effects required optimal separation distances that
permitted each Fab to achieve its specific binding orientation
(Figure S4 and Tables S1, S2, S3, S4, and S5). We conclude that
hetero-diFabs can



(C) Neutralization of primary HIV-1 strains by b12 and PG16 homo-diFabs, each constructed with a 60 bp dsDNA bridge. IC50s are reported for the homo-diFabs,
the parental Fabs and IgGs, and dsDNA alone. As a measure of potential synergy, the molar ratio of the IC50 values for the IgG and the homo-diFab is listed for
each strain in parentheses beside the IC50 for the homo-diFab.

See also Figure S2.

IC50 (nM)

| Virus | Clade | Tier | PG16-40bp-3BNC60 | PG16-50bp-3BNC60 | PG16-60bp-3BNC60 | PG9-60bp-3BNC60 | 3BNC60-60bp-b12 |
|---|---|---|---|---|---|---|---|
| 6535.3 | B | 1B | 0.31 (351x) | 0.23 (474x) | 0.66 (165x) | | |

Figure 4. Synergistic dsDNA-Based Hetero-diFabs

Neutralization of primary HIV-1 strains by hetero-diFabs. IC50s are reported for the hetero-diFabs. See Tables S1, S2, S3, S4, and S5 for IC50s of parental Fabs
and IgGs, dsDNA alone, and the non-covalent mixtures of Fabs and dsDNA. As a measure of potential synergy of each hetero-diFab, the molar ratio of the IC50
values for the non-covalent mixture and the hetero-diFab is listed for each strain in parentheses beside the IC50 for the hetero-diFab. NT, not tested. See also
Figure S4.

Figure 5. Synergistic Protein-Based Hetero-diFab

(A) Schematic representation of PG16-TPR12-3BNC60 (not to scale).
Approximate lengths are indicated (120 Å for the TPR12 linker plus
~11 Å for the fused click handles).

(B) Neutralization of primary HIV-1 strains. IC50s are reported for
PG16-TPR12-3BNC60, the parental components of the reagent (PG16 Fab
and 3BNC60 Fab-TPR12), and TPR12 alone. As a measure of potential
synergy of PG16-TPR12-3BNC60, the molar ratio of the IC50 values for
the most potent component and PG16-TPR12-3BNC60 is listed for each
strain in parentheses beside the IC50 for PG16-TPR12-3BNC60.

See also Figure S5.

achieve synergy through simultaneous recognition of two
different epitopes on the same HIV-1 Env trimer.

To more precisely define optimal intra-epitope separation
distances, we evaluated hetero-diFabs with different bridge lengths,
finding length-dependent synergy effects. For example, PG16-3BNC60
hetero-diFabs with 40 bp and 50 bp dsDNA bridges showed improved
neutralization potencies when compared to the 60 bp (204 Å) version,
achieving >100-fold potency increases against over half of the
tested strains and geometric mean improvements of 98- and 107-fold,
respectively (Figure 4 and Table S4). The 40 bp and 50 bp bridges
(136 Å and 170 Å, respectively) corresponded to the approximate
separation distances between PG16 and 3BNC60 Fabs when bound to the
same gp120 within a trimer (147 Å) or to neighboring protomers within
open or partially open trimers (167 Å) (Figure S3). In a second
length dependency example, 10-1074-40bp-3BNC60 was more potent than
10-1074-60bp-3BNC60 (Figure 4 and Table S4). The ~136 Å distance
between the two Fabs in 10-1074-40bp-3BNC60 corresponded to the
approximate separation between these Fabs bound to the same gp120
(141 Å), whereas 60 bp more closely approximated Fabs bound to
neighboring protomers on an open trimer (193 Å) (Figure S3). The
40 bp and 50 bp versions of 10E8-3BNC60 showed consistent synergy
(Figure 4 and Table S4); however, the lack of structural information
concerning 10E8 binding to Env trimer hindered interpretation of
10E8-containing hetero-diFabs.
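
The matching of bridge lengths to structural separations described above can be sketched as a nearest-length lookup. This is an illustration only: the helper name and the candidate set (40, 50, and 60 bp) are assumptions made here, the 3.4 Å/bp increment is the one quoted for the dsDNA ruler, and the target distances are the structural estimates given in the text.

```python
# Minimal sketch: pick the candidate dsDNA bridge whose nominal length is
# closest to a structural separation distance. Names are illustrative.
RISE_PER_BP_ANGSTROM = 3.4


def closest_bridge_bp(target_angstrom, candidates_bp=(40, 50, 60)):
    """Return the candidate bridge (in bp) nearest to target_angstrom."""
    return min(
        candidates_bp,
        key=lambda n_bp: abs(n_bp * RISE_PER_BP_ANGSTROM - target_angstrom),
    )


if __name__ == "__main__":
    # Separation distances (Å) quoted in the text for Fab pairs on the trimer.
    targets = {
        "PG16/3BNC60, same gp120": 147,
        "PG16/3BNC60, neighboring protomers": 167,
        "10-1074/3BNC60, same gp120": 141,
        "10-1074/3BNC60, neighboring protomers (open)": 193,
    }
    for label, distance in targets.items():
        print(f"{label}: {distance} Å -> {closest_bridge_bp(distance)} bp")
```

Under these assumptions the same-gp120 distances map to the 40 bp bridge, the 167 Å distance to the 50 bp bridge, and the 193 Å distance to the 60 bp bridge, in line with the correspondences described in the text.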



A Hetero-diFab Constructed with a 
Protein Linker Exhibits Synergistic 
Potency Increases 

Bivalent molecules involving dsDNA 
linkers were effective for demonstrating 
synergistic neutralization, but a protein 
reagent would be preferable as an anti- 
HIV-1 therapeutic. We recently described 
a series of protein linkers of various lengths and rigidities (Klein 
et al., 2014) that can mimic the properties of different lengths 
of dsDNA. Thus, we can substitute a comparable protein linker 
for an optimal dsDNA bridge to create a protein reagent capable 
of simultaneous binding to two different epitopes on a single HIV- 
1 spike trimer. As a proof-of-principle example, we used sortase- 
catalyzed protein ligation and click chemistry (Witte et al., 2013) 
to construct a bivalent reagent analogous to PG16-40bp- 
3BNC60 by substituting the dsDNA linker with 12 domains of a 
designed tetratricopeptide-repeat (TPR) protein (Kajander 
et al., 2007) (Figures 5A and S5). We chose a TPR linker because 
tandem repeats of TPR domains form a rigid rod-like structure 
whose length corresponds predictably with the number of re- 
peats, with each domain contributing ~10 A (Kajander et al., 
2007). PG16 Fab was expressed with a C-terminal sortase 
signal, and the C terminus of the 3BNC60 Fab was modified to 
include 12 TPR repeats and a sortase signal. The tagged Fabs 

Figure 6. Simulations of Avidity Effects due to Bivalent Binding of IgG to a Tethered Antigen

(A) The fraction of tethered antigen bound by different concentrations of IgG or Fab after 1 hr shown as a heat map (cooler colors representing a lower percentage
bound and warmer colors representing a higher percentage bound) as a function of kinetic constants for the IgG-antigen or Fab-antigen interaction. The fraction
of antigen bound by a Fab or IgG was calculated as a function of ka and kd. The intrinsic affinities are strongest in the lower right corner (1 pM) and weakest in the
upper left corner (100 mM) of each graph. For IgG, binding was forced to 100% monovalent binding (middle row) or 100% bivalent binding (bottom row).
Saturation by Fabs and IgGs was nearly identical for monovalent binding conditions because the binding kinetics of IgGs would be enhanced by at most 2-fold.
Comparisons of the simulations for bivalent binding (bottom row) and monovalent binding (top two panels) showed regions of saturation binding resulting from
avidity effects.

(B) The fraction of antigen bound as a function of time for IgGs binding to surface-tethered antigens at an input concentration of 10 nM. When the dissociation rate
constant of the Fab portion of the IgG is slow (top) and the input concentration is approximately 100-fold higher than the affinity of the Fab, IgGs can reach
saturation binding after an hour whether binding monovalently or bivalently to the surface; hence, avidity effects are not apparent after an hour. However,
weakening the affinity of the Fab by making the dissociation rate 1,000-fold faster (bottom) prevents saturation when binding monovalently but has no effect on
saturation when binding bivalently; hence, avidity effects are apparent throughout the incubation.



were covalently attached to peptides containing click handles 
using sortase-catalyzed ligation and then incubated to allow 
the click reaction to form PG16 Fab linked to 3BNC60 Fab 
by 12 TPR repeats (PG16-TPR12-3BNC60). Together with the
remnants of the click handles, the linker would occupy ~131 Å,
approximately the same length as the dsDNA linker in the
PG16-40bp-3BNC60 reagent (Figures 5A and S5). The protein-based
molecule, PG16-TPR12-3BNC60, exhibited between 11- and 
>200-fold synergy against 12 primary HIV-1 strains (Figure 5B; 
33-fold geometric mean increased potency). 
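
The length bookkeeping for the protein linker can be written out explicitly. The sketch assumes the round numbers given in the text (~10 Å per TPR domain, ~131 Å in total for 12 repeats plus click-handle remnants, which implies roughly 11 Å for the handles) and the 3.4 Å/bp dsDNA increment; the constant and function names are illustrative.

```python
# Minimal sketch of the TPR linker length estimate, using approximate
# values quoted in the text. All names are illustrative assumptions.
TPR_REPEAT_ANGSTROM = 10.0      # approximate contribution per TPR domain
CLICK_HANDLE_ANGSTROM = 11.0    # approximate remnant of the click handles
RISE_PER_BP_ANGSTROM = 3.4      # dsDNA rise per base pair


def tpr_linker_length(n_repeats: int) -> float:
    """Approximate length (Å) of an n-repeat TPR linker plus click handles."""
    return n_repeats * TPR_REPEAT_ANGSTROM + CLICK_HANDLE_ANGSTROM


def dsdna_length(n_bp: int) -> float:
    """Approximate length (Å) of an n_bp dsDNA bridge."""
    return n_bp * RISE_PER_BP_ANGSTROM


if __name__ == "__main__":
    protein = tpr_linker_length(12)
    dna = dsdna_length(40)
    print(f"TPR12 linker : {protein:.0f} Å")
    print(f"40 bp dsDNA  : {dna:.0f} Å")
    print(f"difference   : {abs(dna - protein):.0f} Å")
```

On these estimates the 12-repeat linker and the 40 bp bridge differ by only a few Ångströms, which is the basis for treating them as comparable spacers.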

Simulations of the Effects of Avidity on IgG Binding to 
Tethered Antigens 

To better understand the effects of avidity arising from bivalent
binding of IgGs to antigens tethered to a surface such as a viral
membrane, we used modeling software to simulate the saturation of
surface-bound antigens by monovalent Fabs and bivalent IgGs. We chose
a 1 hr incubation time based upon conditions under which in vitro
neutralization assays are conducted (Montefiori, 2005). We varied the
density of the tethered antigens and the concentrations of Fab or IgG
and investigated a range of intrinsic association and dissociation
rate constants for the binding interaction. The fraction of antigen
bound by a Fab or IgG was calculated as a function of on- and
off-rates (ka and kd), whose ratio (kd/ka) is equal to the affinity
(KD, or equilibrium dissociation constant). We compared saturation by
Fabs (top row), IgGs in which only monovalent binding was permitted
(center row), and IgGs that bound bivalently through crosslinking of
neighboring antigens (bottom row) (Figure 6A). As expected,
saturation by Fabs and IgGs was nearly identical for monovalent
binding conditions (Figure 6A, first two rows). By contrast, across a
range of input concentrations, there were ka and kd combinations for
IgGs binding bivalently that exhibited saturation binding under
conditions in which monovalent Fabs and IgGs binding monovalently did
not (Figure 6A, bottom row). Thus, consistent with experimental
results in the palivizumab/RSV system (Wu et al., 2005), the
simulations suggested that bivalency through crosslinking can rescue
binding of IgGs whose Fabs exhibit weak binding affinities as a
result of fast dissociation rate constants, whereas IgGs whose
Fabs exhibit high affinities because of slow dissociation rates
did not display strong avidity enhancement.
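
The qualitative behavior described above can be reproduced with a deliberately simple toy model. This is not the modeling software used for Figure 6: it is a hedged sketch that assumes excess ligand at constant concentration, a closed-form 1:1 binding expression for the monovalent case, and a three-state scheme (free, one arm bound, both arms bound) for the bivalent case, in which the second arm sees an assumed effective local concentration `c_eff`. All function names, the value of `c_eff`, and the rate constants in the example are assumptions chosen for illustration.

```python
import math


def monovalent_fraction(conc, ka, kd, t):
    """Fraction of antigen bound after t seconds for 1:1 binding.

    conc in M, ka in 1/(M*s), kd in 1/s; ligand is assumed to be in excess.
    """
    k_obs = ka * conc + kd
    return (ka * conc / k_obs) * (1.0 - math.exp(-k_obs * t))


def bivalent_fraction(conc, ka, kd, t, c_eff=1e-3, dt=0.005):
    """Toy three-state model of bivalent binding, integrated by forward Euler.

    p0: antigen pair free; p1: one Fab arm bound; p2: both arms bound.
    c_eff (M) is an assumed effective concentration for the second arm.
    Returns the fraction of antigen pairs engaged by an IgG (p1 + p2).
    """
    p0, p1, p2 = 1.0, 0.0, 0.0
    for _ in range(int(t / dt)):
        bind_first = ka * conc * p0      # first arm binds from solution
        release_last = kd * p1           # singly bound IgG dissociates
        bind_second = ka * c_eff * p1    # second arm binds a neighboring antigen
        release_one = 2.0 * kd * p2      # either arm of a doubly bound IgG releases
        p0 += dt * (release_last - bind_first)
        p1 += dt * (bind_first - release_last - bind_second + release_one)
        p2 += dt * (bind_second - release_one)
    return p1 + p2


if __name__ == "__main__":
    conc, ka, one_hour = 10e-9, 1e5, 3600.0   # 10 nM input, 1 hr incubation
    for label, kd in (("slow off-rate", 1e-4), ("fast off-rate", 1e-1)):
        mono = monovalent_fraction(conc, ka, kd, one_hour)
        biv = bivalent_fraction(conc, ka, kd, one_hour)
        print(f"{label}: monovalent {mono:.3f}, bivalent {biv:.3f}")
```

With these illustrative parameters, the slow off-rate case is close to saturation after an hour in both binding modes, whereas the fast off-rate case stays near 1% occupancy when binding monovalently but is mostly bound when the second arm can engage. The numbers depend strongly on the assumed `c_eff` and should be read as qualitative only.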

The simulations also demonstrate that the effects of avidity on
binding are a complicated mixture of kinetics, input concentration,
and incubation time. At any particular concentration, the threshold
at which avidity is observed is controlled by kinetics rather than
affinity because different combinations of kinetic constants yield
the same KD. The kinetic threshold at which avidity effects are
observed varies depending on the difference between the input
concentration and the KD. For concentrations near or below the KD,
there is a kinetic threshold of on- and off-rates below which avidity
enhancement is not observed (Figure 6A). The binding reactions are
also affected by the length of incubation, such that the lower the
input concentration, the longer it takes to reach saturation
(Figure 6B).
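
The point that kinetics, not affinity alone, determine binding at a fixed time can be illustrated with the standard closed-form expression for 1:1 binding under excess ligand. The sketch below is an assumption-laden illustration for the monovalent case only (it is not the simulation behind Figure 6), and the rate constants are hypothetical values chosen so that both pairs share the same KD.

```python
import math


def fraction_bound(conc, ka, kd, t):
    """Fraction of antigen bound after t seconds (1:1 binding, excess ligand).

    conc in M, ka in 1/(M*s), kd in 1/s. The equilibrium value is
    conc / (conc + KD) with KD = kd / ka; k_obs sets how fast it is approached.
    """
    k_obs = ka * conc + kd
    return (ka * conc / k_obs) * (1.0 - math.exp(-k_obs * t))


if __name__ == "__main__":
    conc, one_hour = 10e-9, 3600.0
    # Two hypothetical rate-constant pairs with the same KD of 10 nM.
    pairs = {"fast kinetics": (1e6, 1e-2), "slow kinetics": (1e3, 1e-5)}
    for label, (ka, kd) in pairs.items():
        kd_equilibrium = kd / ka
        bound = fraction_bound(conc, ka, kd, one_hour)
        print(f"{label}: KD = {kd_equilibrium:.1e} M, bound after 1 hr = {bound:.3f}")
```

Both pairs have the same equilibrium occupancy (half of the antigen at an input concentration equal to the KD), but only the fast pair approaches it within the 1 hr incubation.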

We note that the simulations model binding interactions only,
whereas our homo- and hetero-diFabs were evaluated for their ability
to enhance neutralization of viral infectivity, a process more
complicated than binding. For example, neutralization mechanisms may
involve conformational changes in Env that were not accounted for in
our binding simulation. In addition, kinetic constants for
antibody-mediated neutralization of HIV-1 are not known, nor is the
fraction of Env spikes on a virion that are required for
neutralization or for fusion. In any case, it appears that the
kinetic properties of the bNAb Fab components in our reagents were
appropriate to realize avidity-enhanced neutralization because
hetero-diFab reagents displayed ~100-fold mean improved
neutralization potencies. The data therefore support the hypothesis
that intra-spike crosslinking by anti-HIV-1 binding molecules
represents a valid strategy for increasing potency and resistance to
HIV-1 Env mutations.

DISCUSSION 

We engineered HIV-1 spike-binding molecules designed to bind 
bivalently to demonstrate the importance of avidity effects in 
antibody efficacy in HIV-1 neutralization and to establish that 
lack of bivalent binding by physiologic IgGs is an additional
antibody evasion strategy utilized by HIV-1.

The importance for HIV-1 in maintaining a low spike density to
avoid inter-spike crosslinking by IgGs was suggested by the
relatively small improvements in neutralization potencies of intact
anti-HIV-1 IgGs compared with their Fab counterparts (Klein and
Bjorkman, 2010) and by the discovery that polyreactivity increased
the apparent affinity of anti-HIV-1 antibodies through a mechanism of
heteroligation (Mouquet et al., 2010). Comparison of the
neutralization potencies of IgGs versus Fabs in the current study
provides further support for the observation that anti-HIV-1 IgGs
generally exhibit relatively small increased potencies compared to
Fabs. To quantify potential avidity effects, we previously defined
the molar neutralization ratio (MNR) for IgG versus Fab forms of an
antibody as IC50 Fab (nM)/IC50 IgG (nM). In the absence of avidity or
other advantages of the IgG compared with the Fab (e.g., increased
size), the ratio would be 2.0 (Klein and Bjorkman, 2010). In the
current study, the mean MNR for PG16, an IgG that cannot exhibit
intra-spike crosslinking (Julien et al., 2013b), was 8.0 (data from
Figure 2C), similar to the 10.5 mean MNR in a previous study (West
et al., 2012). These values are lower than MNRs observed for IgGs
against densely packed viruses, which can be over 1,000, but are
consistent with a limited amount of inter-spike crosslinking by
anti-HIV-1 IgGs whose epitopes on neighboring spikes are accessible
to simultaneous engagement of the combining sites of the two Fabs of
an IgG, which are separated by ~150 Å (Klein and Bjorkman, 2010). Our
current results suggested that inter-spike crosslinking can be
increased by creating homo-diFabs with 70 bp-100 bp dsDNA linkers
(Figures 2B and S2D). These linkers would separate the combining
sites of the Fabs by >240-340 Å, distances that should enhance
inter-spike crosslinking.
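
As a worked illustration of the MNR definition above, the sketch below computes IC50 Fab/IC50 IgG for a few Fab/IgG pairs whose values are read from the Figure 2C table (only entries without ">" or "<" qualifiers are used). The function name and the dictionary layout are assumptions made for this example; the baseline of 2.0 is the no-avidity expectation stated in the text.

```python
# Minimal sketch of the molar neutralization ratio (MNR):
#   MNR = IC50 of the Fab (nM) / IC50 of the IgG (nM)
NO_AVIDITY_MNR = 2.0  # expected ratio without avidity, as stated in the text


def molar_neutralization_ratio(ic50_fab_nm: float, ic50_igg_nm: float) -> float:
    """Return IC50(Fab) / IC50(IgG); both values must be positive and in nM."""
    if ic50_fab_nm <= 0 or ic50_igg_nm <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_fab_nm / ic50_igg_nm


if __name__ == "__main__":
    # (antibody, strain): (Fab IC50 nM, IgG IC50 nM), taken from Figure 2C.
    examples = {
        ("b12", "6535.3"): (360.0, 60.0),
        ("3BNC60", "6535.3"): (27.0, 4.0),
        ("PG16", "SC422661.8"): (65.0, 4.7),
    }
    for (antibody, strain), (fab, igg) in examples.items():
        mnr = molar_neutralization_ratio(fab, igg)
        print(f"{antibody} vs {strain}: MNR = {mnr:.1f} "
              f"({mnr / NO_AVIDITY_MNR:.1f}x the no-avidity baseline)")
```

These per-strain ratios are single examples and are not a substitute for the panel-wide mean MNR values reported in the text.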

Indirect evidence for the hypothesis that HIV-1 evolved a low
spike density to avoid inter-spike crosslinking by IgGs comes from
studies of a cytoplasmic tail deletion in the simian immunodeficiency
virus (SIV) spike trimer. Cytoplasmic tail deletion has been
suggested to increase the number of spikes per virion (Zingler and
Littman, 1993) and/or the spike mobility in the virion bilayer
(Crooks et al., 2008), both of which could enhance inter-spike
crosslinking. Although tail-deleted mutant viruses can be produced
in vitro, propagation of the virus in macaques favors viruses
containing the full-length envelope spike (Zingler and Littman,
1993). These findings are consistent with the idea that an intact
host immune system selects against those viruses that facilitate the
ability of host IgGs to bind bivalently through inter-spike
crosslinking.

Here, we present a method to create potential intra-spike 
crosslinking antibody-based molecules using dsDNA- and pro- 
tein-based linkers and demonstrate that these reagents can 
exhibit up to three orders of magnitude increases in neutraliza- 
tion potency. We argue that the optimized versions of our engi- 
neered molecules achieve potency increases through intra- 
spike, rather than inter-spike, crosslinking because (1) distances 
measured between epitopes on virion-bound spike trimers cor- 
responded to approximate intra-epitope distances on HIV-1 
spike trimer structures, and (2) increases in inter-spike crosslink- 
ing by homotypic and heterotypic reagents should not exhibit 
sharp linker length-dependent neutralization potencies because 
distances between spikes vary within a single virion and between 
virions. The latter point is valid even if HIV-1 spikes are clustered 
on mature virions, as suggested by fluorescence nanoscopy 
(Chojnacki et al., 2012), but not cryoelectron microscopy (Liu 
et al., 2008; Zhu et al., 2006). Whether HIV-1 spikes cluster 
upon encountering a target cell to form an entry claw (Sougrat 
et al., 2007) is not relevant to the mechanism of action of our 
reagents because neutralization assays are conducted by incu- 
bating potential inhibitors with virions prior to addition of target 
cells (Montefiori, 2005), a mechanism that is also presumably 
relevant for most in vivo interactions of antibody and antibody- 
like inhibitors. Because avidity effects require recognition of 
two or more antigens tethered to the same surface, another po- 
tential action of our reagents, inter-virion crosslinking, would not 
result in avidity effects by analogy to the lack of avidity enhance- 
ment for an IgG binding two soluble antigens, one per Fab. In this 
respect, we note that, although IgAs are capable of inter-virion 
crosslinking (Stieh et al., 2014), conversion of IgG bNAbs to 
IgAs did not result in potency increases (Kunert et al., 2004;
Wolbank et al., 2003).

The use of dsDNA- and protein-based molecular rulers to measure
inter-epitope distances presented here can be used to probe
conformations of virion-bound Env trimers. By contrast, EM and X-ray
structures (Bartesaghi et al., 2013; Julien et al., 2013a; Lyumkis
et al., 2013; Pancera et al., 2014) cannot capture dynamic
information concerning Env conformations during neutralization.
Single-molecule fluorescence resonance energy transfer (smFRET)
measurements suggested that Env trimers on the surface of HIV-1
virions transition between different conformations (Munro et al.,
2014), and spike trimers have been visualized by EM in different
conformations: the closed structure of unliganded trimers and trimers
associated with VRC01-like bNAbs (Bartesaghi et al., 2013; Liu
et al., 2008; Lyumkis et al., 2013) (also observed in Fab-bound
crystal structures [Julien et al., 2013a; Pancera et al., 2014]), a
CD4- and/or 17b-bound open structure (Liu et al., 2008; Tran et al.,
2012), and a partially open b12-bound structure (Liu et al., 2008)
(Figure 3A). Homo- and hetero-diFabs joined by different lengths of
dsDNA bridges offer a new methodology to probe Env trimer
conformational states on virions and potentially to address
strain-specific conformational differences.

Homo-diFabs constructed from VRC01-like bNAbs showed greatest
potency when binding to epitopes separated by distances most closely
approximating the open structure (Liu et al., 2008; Tran et al.,
2012), rather than the closed structure observed for soluble and
virion-associated spike trimers bound to VRC01-like Fabs (Bartesaghi
et al., 2013; Liu et al., 2008; Lyumkis et al., 2013) (Figures 2B, 3,
S3, and S4). These results
suggest that optimal intra-spike crosslinking molecules can 
inhibit a different state than recognized by monovalent Fabs 
binding to spike trimers in static EM and X-ray structures (Barte- 
saghi et al., 2013; Liu et al., 2008; Lyumkis et al., 2013; Merk and 
Subramaniam, 2013; Tran et al., 2012). If so, one Fab of a homo- 
diFab could first bind to its epitope on a closed trimer, allowing 
the second Fab to latch on to a transiently populated open 
form of that trimer. Alternatively, binding of the first Fab may 
trap the trimer into a conformation allowing increased accessi- 
bility of the second Fab, or both Fabs could bind simultaneously 
to a transiently appearing open trimer. Interestingly, the distance 
dependence of two CD4bs antibodies, 3BNC60 and b12, was 
more strongly pronounced for a Tier 1B HIV-1 strain, 6535.3,
than for Tier 2 or 3 strains against which the homo-diFabs 
were tested (Figures 2B and S2). Tier categorization of HIV-1 
strains refers to the sensitivity of a strain to antibody neutraliza- 
tion, with Tier 1 strains being more sensitive in general to anti- 
bodies than Tier 2 or 3 strains (Seaman et al., 2010). The differ- 
ences in length dependence for CD4bs homo-diFabs may 
reflect differences in conformational variability within Env trimers 
from different tiers, with Tier 1 Env perhaps more easily able to 
adopt the open conformations likely recognized by the CD4bs 
antibodies with optimal bridge lengths. 

For the PG16-3BNC60 hetero-diFabs, the optimal 40 bp and 
50 bp bridge lengths (136 Å and 170 Å, respectively) corre- 
sponded to the approximate separation distances between 
PG16 and 3BNC60 Fabs when bound to the same gp120 within 
a trimer (147 Å) or to neighboring protomers within open or 
partially open trimers (167 Å) (Figure S3). In a second hetero- 
diFab bridge length dependency example, 10-1074-40bp- 
3BNC60 was more potent than 10-1074-60bp-3BNC60 (Figure 4 
and Table S4). The ~136 Å distance between the two Fabs in 10- 
1074-40bp-3BNC60 corresponded to the approximate separa- 
tion between these Fabs bound to the same gp120 (141 Å), 
whereas 60 bp more closely approximated Fabs bound to neigh- 
boring protomers on an open trimer (193 Å) (Figure S3). In gen- 
eral, it is more difficult to deduce information about Env trimer 
conformations recognized by hetero-diFabs because the intra- 
epitope distance is the same in the three conformations for 
Fabs binding to the same gp120 subunit within an Env trimer 
(Figure S3), and length-dependence data for some of the het- 
ero-diFabs (e.g., 10-1074-40bp-3BNC60) were consistent with 
binding to a single gp120 within an Env trimer, as well as to adja- 
cent gp120s (Figure S2). However, whether binding to the same 
or to adjacent protomers within the spike trimer, the increased 
synergy of optimal hetero-diFabs suggested a mechanism in 
which the more potent/tighter-binding Fab of the hetero-diFab 
initially bound to the viral spike, thereby allowing the second 
Fab, even when only weakly neutralizing on its own, to attach. 
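
The bridge lengths quoted above (40 bp as 136 Å, 50 bp as 170 Å) are consistent with treating the dsDNA bridge as a rigid rod with a rise of about 3.4 Å per base pair. The following is a minimal sketch of that conversion, not the authors' calculation; the 3.4 Å rise is the canonical B-form DNA value and is an assumption here, as are the function and constant names.

```python
# Assumption: canonical B-form DNA rise of ~3.4 angstrom per base pair.
# Names below are illustrative and do not come from the paper.
RISE_PER_BP_ANGSTROM = 3.4


def bridge_length_angstrom(n_bp: int) -> float:
    """Approximate end-to-end length of an n_bp dsDNA bridge as a rigid rod."""
    return n_bp * RISE_PER_BP_ANGSTROM


if __name__ == "__main__":
    for n_bp in (40, 50, 60):
        length = bridge_length_angstrom(n_bp)
        print(f"{n_bp} bp -> about {length:.0f} angstrom")
```

Under the same assumption a 60 bp bridge comes out near 204 Å, which is the value the text compares with the 193 Å separation measured for neighboring protomers on an open trimer.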

In summary, our results demonstrated that optimal length 
homo- and hetero-diFabs are capable of synergistic effects 
that increased neutralization potencies and, in some cases, al- 
lowed neutralization of viral strains resistant to conventional 
IgGs. These results are consistent with the hypothesis that 
most anti-HIV-1 IgGs bind monovalently to single Env spikes, 
which leaves them vulnerable to Env mutations that weaken 
monovalent interactions but would still permit bivalent interac- 
tions (Klein and Bjorkman, 2010). The demonstration that anti- 
HIV-1 reagents designed to be capable of intra-spike binding 
with avidity can more potently and broadly neutralize HIV-1 
than conventional anti-spike IgGs is relevant to the choice of 
anti-HIV-1 proteins or genes to be delivered passively to prevent 
infection or suppress active infections. Bi-specific antibodies 
that simultaneously bind to HIV-1 Env and to CD4 or CCR5 
host receptors on the target cell represent a conceptually distinct 
method to increase the potency and breadth of anti-HIV-1 re- 
agents (Pace et al., 2013). In contrast to these reagents, anti- 
bodies that achieve synergy via bivalent binding to Env by 
intra-spike crosslinking offer significant advantages for passive 
delivery; for example, neutralizing antibodies against HIV-1 Env 
protect more effectively in vivo than antibodies against CD4 
(Pegu et al., 2014), and anti-self antibodies such as anti-CD4 
IgGs have short half-lives in vivo (Bruno and Jacobson, 2010). 
We propose that the ideal therapeutic molecule would utilize 
avidity achieved by intra-spike crosslinking to reduce the con- 
centration required for sterilizing immunity and render the low 
spike density of HIV-1 irrelevant to its efficacy. Moreover, analo- 
gous to using several drugs or antibodies during anti-retroviral 
therapy, simultaneous binding to different HIV-1 epitopes should 
reduce or abrogate sensitivity to Env mutations. 

EXPERIMENTAL PROCEDURES 
Expression and Purification of Fabs 

Genes encoding IgG light chains were modified by site-directed muta- 
genesis to replace Cys263 (light chain), the C-terminal cysteine that forms a 
disulfide bond with Cys233 (heavy chain), with a serine. Modified light-chain genes 
and genes encoding 6x-His- or StrepII-tagged Fab heavy chains (VH-CH1-tag) 
were subcloned separately into the pTT5 mammalian expression vector (NRC 
Biotechnology Research Institute). Fabs were expressed by transient transfec- 
tion in HEK293-6E (NRC Biotechnology Research Institute) cells as described 
(Diskin et al., 2011) and purified from supernatants by Ni-NTA or StrepII affinity 
chromatography followed by size exclusion chromatography in PBS pH 7.4 
using a Superdex 200 10/300 or Superdex 200 16/600 column (Amersham 
Biosciences). 

IgG heterodimers were expressed and purified as described in the Extended 
Experimental Procedures. 

DNA Conjugation to Fabs 

DNA was conjugated to free thiol-containing Fabs using a modified version of a 
previously described protocol (Hendrickson et al., 1995). Briefly, Fabs were 
reduced in a buffer containing 10 mM TCEP-HCl (pH 7-8) for 2 hr and then 
buffer exchanged three times over Zeba desalting columns (Thermo Scienti- 
fic). The percentage of reduced Fab was determined using Invitrogen’s Mea- 
sure-iT Thiol Assay. Concurrently, a 5-20 base ssDNA containing a 5' amino 
group (Integrated DNA Technologies, IDT-DNA) was incubated with a 100- 
fold molar excess of an amine-to-sulfhydryl crosslinker (Sulfo-SMCC; Thermo 
Scientific) for 30 min to form a maleimide-activated DNA strand, which was 
buffer exchanged as described above. The reduced Fab and activated ssDNA 
were incubated overnight, and the Fab-ssDNA conjugate was purified by Ni- 
NTA or StrepII affinity chromatography (GE Biosciences) to remove unreacted 
Fab and ssDNA. 

ssDNA was synthesized, phosphorylated, and PAGE purified by Integrated 
DNA Technologies. For di-Fabs containing dsDNA bridges longer than 40 bp, 
complementary ssDNAs were annealed by heating (95°C) and cooling (room 
temperature) to create dsDNA containing overhangs complementary to the 
Fab-ssDNA conjugates. dsDNA was purified by size exclusion chromatog- 
raphy (Superdex 200 10/300) and incubated overnight with the correspond- 
ing tagged Fab-ssDNA conjugates. Homo- and hetero-diFab reagents were 
purified by Ni-NTA and StrepII affinity chromatography when appropriate to 
remove free DNA and excess Fab-ssDNA conjugates, treated with T4 DNA 
ligase (New England Biolabs), and purified again by size exclusion chroma- 
tography (Figure S1B). To make di-Fabs containing dsDNA bridge lengths 
less than 40 bp, two complementary ssDNA-conjugated Fabs were incu- 
bated at 37°C without a dsDNA bridge and then purified as described above. 
Protein-DNA reagents were stable at 4°C for >6 months as assessed by 
SDS-PAGE. 

Bridge and linker sequences are listed in Extended Experimental 
Procedures. 

Characterization of DNA-Fab Reagents 

Fractions from the center of an SEC elution peak were concentrated using 
Amicon Ultra-15 Centrifugal Filter Units (Millipore) (MW cutoff = 10 kDa) to a 
volume of 500 μl, and dynamic light scattering (DLS) measurements were performed on a DynaPro 
NanoStar (Wyatt Technology) using the manufacturer’s suggested settings. 
Hydrodynamic radii were determined as described (Dev and Surolia, 2006). 
Briefly, a nonlinear least-squares fitting algorithm was used to fit the measured 
correlation function to obtain a decay rate. The decay rate was converted to 
the diffusion constant that can be interpreted as the hydrodynamic radius 
via the Stokes-Einstein equation. 
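
The conversion described above can be sketched in a few lines: the fitted decay rate Γ gives a diffusion constant through Γ = D·q², and the Stokes-Einstein relation R_h = k_B·T / (6π·η·D) gives the hydrodynamic radius. This is only an illustration under stated assumptions, not the instrument's software. The wavelength, scattering angle, refractive index, temperature, and viscosity defaults below are hypothetical placeholders for an aqueous sample at 25°C, and every function name is invented for the example.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value


def scattering_vector(wavelength_m=658e-9, angle_deg=90.0, refractive_index=1.33):
    """Magnitude of the scattering vector q (1/m); defaults are assumed values."""
    return (4 * math.pi * refractive_index / wavelength_m) * math.sin(
        math.radians(angle_deg) / 2
    )


def decay_rate_to_diffusion(decay_rate_per_s, q_per_m):
    """Translational diffusion constant D (m^2/s) from Gamma = D * q^2."""
    return decay_rate_per_s / q_per_m**2


def hydrodynamic_radius_nm(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=0.00089):
    """Stokes-Einstein radius in nm; the viscosity default is water at 25 C."""
    radius_m = BOLTZMANN_J_PER_K * temperature_k / (
        6 * math.pi * viscosity_pa_s * diffusion_m2_s
    )
    return radius_m * 1e9


if __name__ == "__main__":
    q = scattering_vector()
    gamma = 1.6e4  # hypothetical fitted decay rate, 1/s
    d = decay_rate_to_diffusion(gamma, q)
    print(f"D = {d:.3e} m^2/s, R_h = {hydrodynamic_radius_nm(d):.2f} nm")
```

Because R_h is inversely proportional to D, a species that diffuses half as fast is reported with twice the radius, which is why longer bridged reagents are expected to show larger hydrodynamic radii.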

Hetero-diFab with TPR Linker 

PG16-TPR12-3BNC60, a C-to-C linked hetero-diFab containing 12 consensus 
TPR domains (Kajander et al., 2007) as a protein linker (Klein et al., 2014), was 
prepared from modified PG16 and 3BNC60 Fabs using a combination of sor- 
tase-catalyzed peptide ligation and click chemistry (Witte et al., 2013). The C 
terminus of the PG16 Fab heavy chain was modified to include the amino 
acid sequence GGGGAS LPETG GLNDIFEAQKIEWHEHHHHHH, comprising 
a flexible linker, the recognition sequence for S. aureus Sortase A (LPETG), 
a BirA tag, and a 6x-His tag. The C terminus of the 3BNC60 Fab heavy chain 
was modified to include a (Gly4Ser)3 linker followed by 12 tandem 
TPR domains and the amino acid sequence ASGGGGSGGGGSGGGGS 
LPETG GHHHHHH, comprising a second (Gly4Ser)3 linker, the Sortase A 
recognition sequence (LPETG), and a 6x-His tag. The Fabs were expressed 
in HEK-6E cells and purified with Ni-NTA and gel filtration chromatography as 
described above. Peptides (GGGK with C-terminal azide and cyclooctyne 
click handles) were synthesized by GenScript, and sortase-catalyzed peptide 
ligation was used to attach the azide-containing peptide to PG16 Fab and the 
cyclooctyne-containing peptide to the 3BNC60-TPR12 fusion protein as 
described (Guimaraes et al., 2013). Approximate yields after each sortase re- 
action were ~30%. Peptide-ligated PG16 and 3BNC60 Fabs were passed 
over a Ni-NTA column to remove His-tagged enzyme and Fabs that did not 
lose their His tags during the reaction, mixed at equimolar ratios, and the click 
reaction was accomplished by incubating overnight at 25°C. The approximate 
yield for the click reaction was ~65%. The resulting PG16-TPR12-3BNC60 
hetero-diFab was purified by size exclusion chromatography to remove un- 
reacted Fabs for an overall yield of ~22%. 

Measurements of Intra-Spike Distances 

To derive predicted distances between two adjacent Fabs bound to HIV-1 Env, 
we superimposed Fabs bound to their epitopes on the structures of Env tri- 
mers in three different conformations: closed (a 4.7 Å crystal structure of a 
gp140 SOSIP trimer; PDB code 4NCO), open (a 9 Å EM structure of a SOSIP 
trimer-17b Fab complex (Tran et al., 2012); coordinates obtained from S. Sub- 
ramaniam), and partially open (an ~20 Å EM structure of a viral spike bound to 
b12 Fab; PDB code 3DNL). The positions of the CH1 and CL domains in Fab 
structures used for docking were adjusted to create Fabs with the average 
elbow bend angle found in a survey of human Fab structures (Stanfield 
et al., 2006). The VH-VL domains of the adjusted Fabs were then superimposed 
on crystal structures of Fab-gp120 or Fab-gp140 complexes (PDB codes 
3NGB, 2NY7, and 4CNO for complexes with VRC01, b12, and PGT122 
Fabs, respectively) or a PG16-epitope scaffold complex (PDB code 4DQO). 
The position on Env trimer of 10-1074, a clonal variant of the PGT121- 
PGT123 family (Mouquet et al., 2012), was approximated using the 4CNO 
gp140-PGT122 structure. In other cases, related antibodies, e.g., PG9/ 
PG16 and VRC01/3BNC117/3BNC60, were also assumed to bind similarly. 
The complex structures were superimposed on the Env trimer structures 
by aligning the common portions. The distance between the Cys233 (heavy chain) 
carbon-α atoms of adjacent Fabs was then measured using PyMol (Schrö- 
dinger, 2011) to approximate the length of dsDNA bridges attached to 
Cys233 (heavy chain). Measurements derived using other EM structures for the closed 
and open trimers (PDB codes 3DNN, 3J5M, and 3DNO) or using a recent 3.5 Å 
Env trimer crystal structure (Pancera et al., 2014) resulted in differences of 
<10 Å for analogous distance measurements. 

In Vitro Neutralization Assays 

Neutralization of pseudoviruses derived from primary HIV-1 isolates was moni- 
tored by the reduction of HIV-1 Tat-induced luciferase reporter gene expres- 
sion in the presence of a single round of pseudovirus infection in TZM-bl cells 
as described previously (Montefiori, 2005) and in the Extended Experimental 
Procedures. 

Simulation of Fab and IgG Saturation of Surface-Bound Antigens 

Numerical analysis (Mathematica, v. 10) was used to simulate saturation of 
surface-bound antigens by monovalent Fabs (Equation 1), bivalent IgGs to un- 
paired antigen (Ag) (Equation 2), and paired antigen (pAg) (Equations 3 and 4), 
where “paired antigen” was defined as antigens that are spaced such that an 
IgG can bind two epitopes simultaneously (e.g., intra-spike crosslinking of two 
epitopes on the same viral spike or inter-spike crosslinking between two viral 
spikes). In the bivalent model (Equations 3 and 4), the surface concentrations 
of antigen and IgG-antigen complexes were approximated by the inverse of 
the volume of a sphere (Vg) with radius equal to the hydrodynamic radius of 
the molecule multiplied by Avogadro’s number (NA) as described previously 
(Muller et al., 1998). 

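Fab binding to antigen: 

$$\text{Fab} + \text{Ag} \rightleftharpoons \text{Fab-Ag}$$

$$\frac{d[\text{Fab-Ag}]}{dt} = k_a[\text{Fab}][\text{Ag}] - k_d[\text{Fab-Ag}] \qquad \text{(Equation 1)}$$

IgG binding to unpaired antigen: 

$$\text{IgG} + \text{Ag} \rightleftharpoons \text{IgG-Ag}$$

$$\frac{d[\text{IgG-Ag}]}{dt} = \dots \qquad \text{(Equation 2)}$$

IgG binding to paired antigen: 

$$\text{IgG} + \text{pAg} \rightleftharpoons \text{IgG-pAg}$$

$$\text{IgG-pAg} + \text{pAg} \rightleftharpoons \text{IgG-pAg}_2$$

$$\frac{d[\text{IgG-pAg}]}{dt} = \dots \qquad \text{(Equation 3)}$$

$$\frac{d[\text{IgG-pAg}_2]}{dt} = \dots \qquad \text{(Equation 4)}$$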
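
As an illustration of how a rate law of the Equation 1 form can be integrated numerically, the sketch below uses a simple forward Euler scheme in Python rather than the Mathematica analysis reported in the text. It assumes that Fab is in large excess so that its free concentration stays constant, an assumption that is not stated in the source, and the rate constants and concentrations are hypothetical values chosen only to make the example run.

```python
def simulate_fab_binding(fab_conc, ag_total, ka, kd, t_end, dt=1.0):
    """Forward Euler integration of d[Fab-Ag]/dt = ka[Fab][Ag] - kd[Fab-Ag].

    Assumes free Fab stays at fab_conc (Fab in excess); free antigen is
    ag_total minus the complex formed so far. Units: M, 1/(M s), 1/s, s.
    """
    complex_conc = 0.0
    for _ in range(int(round(t_end / dt))):
        free_ag = ag_total - complex_conc
        rate = ka * fab_conc * free_ag - kd * complex_conc
        complex_conc += rate * dt
    return complex_conc


def equilibrium_fraction(fab_conc, ka, kd):
    """Fraction of antigen bound at steady state: [Fab] / ([Fab] + KD)."""
    return fab_conc / (fab_conc + kd / ka)


if __name__ == "__main__":
    ka, kd = 1e5, 1e-3          # hypothetical rate constants (KD = 10 nM)
    ag_total = 1e-10            # hypothetical antigen concentration, M
    for fab in (1e-9, 1e-8, 1e-7):
        bound = simulate_fab_binding(fab, ag_total, ka, kd, t_end=20000.0)
        print(f"[Fab] = {fab:.0e} M: simulated {bound / ag_total:.3f}, "
              f"expected {equilibrium_fraction(fab, ka, kd):.3f}")
```

At long times the simulated bound fraction approaches [Fab] / ([Fab] + KD), which offers a quick consistency check on the integration step size.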
SUMMARY 

Decreases in the diversity of enteric bacterial popula- 
tions are observed in patients with Crohn’s disease 
(CD) and ulcerative colitis (UC). Less is known about 
the virome in these diseases. We show that the 
enteric virome is abnormal in CD and UC patients. 
In-depth analysis of preparations enriched for free 
virions in the intestine revealed that CD and UC 
were associated with a significant expansion of 
Caudovirales bacteriophages. The viromes of CD 
and UC patients were disease and cohort specific. 
Importantly, it did not appear that expansion and 
diversification of the enteric virome was secondary 
to changes in bacterial populations. These data sup- 
port a model in which changes in the virome may 
contribute to intestinal inflammation and bacterial 
dysbiosis. We conclude that the virome is a candi- 
date for contributing to, or being a biomarker for, hu- 
man inflammatory bowel disease and speculate that 
the enteric virome may play a role in other diseases. 

INTRODUCTION 

Inflammatory bowel disease (IBD) is a complex, remitting and 
relapsing inflammatory disease with genetic and environmental 
risk factors. One environmental contributor is thought to be mi- 
croorganisms that live in the intestine (Gevers et al., 2014; Kostic 
et al., 2014; Minot et al., 2011; Norman et al., 2014; Virgin, 2014). 
Of these microorganisms, bacteria have gained the greatest 
attention and are linked to training mucosal immunity and 
minimizing mucosal inflammation (reviewed in Belkaid and 
Hand [2014]). An aberration in either of these immune pro- 
cesses can have detrimental consequences for IBD progression. 
For example, a reduction of Bacteroidetes and Firmicutes and 
expansion of normally less abundant bacterial taxa (dysbiosis), 
as well as changes in bacterial microbiome function, have 
been associated with both Crohn’s disease (CD) and ulcerative 
colitis (UC) (Kostic et al., 2014; Stappenbeck et al., 2011). Impor- 
tantly, household contacts without IBD can also exhibit signs of 
bacterial dysbiosis (Joossens et al., 2011). These individuals 
have increased intestinal permeability compared to healthy com- 
munity controls (Hollander et al., 1986), suggesting that the bac- 
terial microbiome is heavily influenced by the household environ- 
ment. Investigations have also shown that the home environment 
is a primary determinant of the individual’s bacterial microbiome 
and that humans are the primary vector of bacterial transmission 
between people living within the same household (Lax et al., 
2014). Exchange of viruses between humans within a household 
has not been thoroughly investigated. Nevertheless, investiga- 
tions of the bacterial microbiome and the enteric virome in IBD 
are likely to be optimized by the investigation of household con- 
trols rather than matched controls from different households. 

Emerging data indicate that the viral component of the micro- 
biome, termed the virome, can profoundly influence host physi- 
ology (Handley et al., 2012; Norman et al., 2014; Virgin, 2014). 
Recent advances in sequencing technology have led to the dis- 
covery of a diverse enteric human virome consisting of bacterio- 
phages, as well as eukaryotic viruses (Breitbart et al., 2003; Fink- 
beiner et al., 2008; Minot et al., 2011, 2012, 2013; Reyes et al., 
2010). Importantly, evidence that eukaryotic viruses can interact 
with IBD risk genes to alter intestinal disease comes from studies 
of mice carrying mutations in Il-10 or Atg16l1, indicating that 
members of the virome may contribute to IBD (Basic et al., 
2014; Cadwell et al., 2010; Irving and Gibson, 2008; Sun et al., 
2011). Bacteriophages may also play a direct role in intestinal 
physiology or change the bacterial microbiome through pred- 
ator-prey relationships (Barr et al., 2013; Duerkop et al., 2012; 
Reyes et al., 2013; Willner et al., 2009, 2012). 

The virome, much of which is composed of bacteriophages, 
contains the most diverse genetic elements on earth and is 
only beginning to be characterized at the sequence level (Virgin, 
2014). In the absence of disease, enteric bacteriophage popula- 
tions exhibit significant diversity between individuals and are 
temporally stable (Minot et al., 2013; Reyes et al., 2010). Bacte- 
riophages in the healthy human intestine are predominantly 
temperate double-stranded DNA (dsDNA) Caudovirales or sin- 
gle-stranded DNA (ssDNA) Microviridae that latently infect their 
bacterial hosts and generate few viral progeny that may infect 
and kill other bacteria (Minot et al., 2011, 2013; Reyes et al., 
2010; Waller et al., 2014). However, environ- 
mental stimuli, such as nitric oxide and antibiotics, induce the 
production of infectious bacteriophages that lyse their bacterial 
host and infect neighboring cells bearing specific receptors 
(Lindsay et al., 1998; Maiques et al., 2006; Zhang et al., 2000; 
Zhang and LeJeune, 2008). This process releases infectious vi- 
rions into the intestine, which can be purified and analyzed. Alter- 
ations in bacteriophage abundance have been suggested in CD 
(Lepage et al., 2008; Perez-Brocal et al., 2013; Wagner et al., 
2013); however, these studies did not characterize the enteric vi- 
rome in detail and did not control for factors within households 
that may influence the microbiome. 

Here, we characterized the normal human and IBD enteric 
virome by metagenomic sequencing of the DNA of virus-like 
particle (VLP) preparations from fecal samples obtained from 
UC and CD patients and controls. Throughout the text, we refer 
to two ecological metrics, richness (the number of taxa counted 
per sample) and diversity. Diversity measures both richness 
and the relative abundance (or evenness) of the taxa present; 
changes in diversity can result from alterations in either richness 
or evenness. Detailed analysis of purified virions in VLP prepara- 
tions and bacterial 16S ribosomal RNA sequences from a longi- 
tudinal patient cohort compared to household controls revealed 
the expected decrease in bacterial richness and diversity 
accompanied by a striking IBD-associated increase in bacterio- 
phage richness. These findings were validated in two indepen- 
dent and geographically distinct patient cohorts that contained 
matched controls. The taxonomic substructure of the enteric 
virome and bacterial microbiome in CD and UC showed 
geographic variation in the specific bacteria and viruses 
detected. Together, these data support a model in which IBD- 
associated increases in bacteriophage richness are not merely 
accounted for by an increase in their bacterial host cells. We 
observed both positive and negative correlations between spe- 
cific viral and bacterial taxa. These data demonstrate, for the first 
time, that unique changes in the bacteriophage component of 
the enteric virome occur in CD and UC, raising the possibility 
that these changes may contribute to disease pathogenesis, 
perhaps through a predator-prey relationship between bacterio- 
phages and their bacterial hosts. These data provide a rationale 
for considering virome diagnostics for IBD and manipulation of 
the enteric virome as a novel therapeutic strategy for the man- 
agement of IBD and emphasize the need for a greater under- 
standing of transkingdom interactions within the microbiome 
for other diseases associated with changes in the bacterial 
microbiome (Duerkop and Hooper, 2013; Norman et al., 2014; 
Virgin, 2014). 
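
The distinction drawn above between richness and diversity can be made concrete with a small sketch. The text does not say which diversity index was used, so the Shannon index and Pielou's evenness below are assumptions chosen for illustration, and the count vectors are invented toy data.

```python
import math


def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Evenness J = H / ln(richness); defined as 0 when fewer than two taxa."""
    s = richness(counts)
    return shannon_diversity(counts) / math.log(s) if s > 1 else 0.0


if __name__ == "__main__":
    even_sample = [25, 25, 25, 25]    # toy counts: four equally abundant taxa
    skewed_sample = [97, 1, 1, 1]     # toy counts: same richness, one dominant taxon
    for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
        print(name, richness(sample),
              round(shannon_diversity(sample), 3),
              round(pielou_evenness(sample), 3))
```

Both toy samples have the same richness, yet the skewed sample has lower diversity because its evenness is lower, which is the sense in which a change in diversity can arise from either component.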

RESULTS 

Virome Alterations Are Observed in Multiple Cohorts 

To initially define the enteric virome associated with IBD, we 
performed metagenomic sequencing of stool filtrates using the 
Roche 454 platform on three independent cohorts consisting 
of IBD and non-IBD household controls (Tables S1 and S2; 
Cambridge, United Kingdom; Chicago, USA; and Los Angeles, 
USA). On average, we obtained 32,591 ± 27,531 sequences 
(number ± SD) that were 282 ± 47 nucleotides in length from 
72 fecal samples (Table S3; 12 household controls, 18 CD, and 
42 UC). Sequences were demultiplexed, quality filtered, and as- 
signed taxonomy (Supplemental Information). The majority of 
sequences obtained were assigned to the human host or to bac- 
terial taxa (Table S3). Consistent with previous reports, bacterio- 
phages of the Caudovirales order and Microviridae family were 
the most abundant viral taxa identified in all three cohorts (Fig- 
ure 1A) (Minot et al., 2011; Reyes et al., 2010). Other viruses 
were detected in a limited number of samples and represented 
an average of five percent or fewer of the total viral sequences. 
An analysis of the relative abundances of sequences from 
all three cohorts revealed an inverse correlation between the 
Caudovirales and Microviridae (Figure 1B). This inverse correla- 
tion was also present when controls, CD, and UC were analyzed 
separately (data not shown). Comparing household controls to 
IBD samples within the UK cohort revealed that this dispropor- 
tionate representation of bacteriophage abundance was associ- 
ated with IBD (Figure 1C). Disparate ratios of Caudovirales and 
Microviridae were also observed in patients from Los Angeles. 
This suggestive correlation between disease and a change 
in sequences from the enteric virome was striking given the 
geographical and environmental diversity of the cohorts. 
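
The inverse correlation between Caudovirales and Microviridae relative abundance is summarized in Figure 1B by a Spearman correlation coefficient. The sketch below shows one way such a coefficient can be computed, as the Pearson correlation of average ranks. It is a plain-Python illustration and not the authors' analysis code, and the abundance values are invented toy numbers.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    caudovirales = [0.90, 0.70, 0.50, 0.30, 0.10]   # toy relative abundances
    microviridae = [0.05, 0.20, 0.45, 0.60, 0.85]   # toy relative abundances
    print(f"Spearman rho = {spearman(caudovirales, microviridae):.2f}")
```

Because the coefficient depends only on ranks, it captures a monotonic inverse relationship even when the relationship between the two abundances is not linear.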

In-Depth Analysis of Free Virions in the Enteric Virome 
in IBD 

These initial observations prompted us to perform an in-depth 
analysis of the virome by metagenomic sequencing of VLPs 
purified from the feces of patients and controls from 17 IBD 
households in the UK (Figure 2A and Tables S1 and S2). VLP pu- 
rification enriches for free virions (Reyes et al., 2012; Thurber 
et al., 2009). To further refine the relationship between IBD and 
the enteric virome and to take into account prior data indicating 
that the bacterial microbiome and virome are similar within 
households (Lax et al., 2014; Reyes et al., 2010), we compared 
IBD patients to matched household controls (Figure 2A). This is 
particularly important for virome analysis given the high interper- 
sonal variation of viruses (Reyes et al., 2010). Samples were 
collected from both the IBD patient and household control at 
the time of a clinical flare of disease (Supplemental Information). 
In total, 21 household control samples and 52 IBD samples 
Figure 1. Virus Taxonomic Assignment and 
Imbalance in IBD 

(A) Relative abundance of sequences assigned to 
the indicated viral taxa. Error bars represent the 
mean ± SD. 

(B) Correlation plot of the Caudovirales and 
Microviridae relative abundance for all samples. 
Linear regression ± 95% confidence interval and 
Spearman correlation coefficient are shown. 

(C) Microviridae and Caudovirales relative abun- 
dance for United Kingdom household controls, 
UC and CD (top); Los Angeles and Chicago IBD 
(bottom). The bars indicate the median and inter- 
quartile range. Statistical significance was deter- 
mined by the Mann-Whitney test. 

See also Tables S1, S2, and S3. 
(24 active disease and 28 inactive disease samples; 36 total UC 
and 16 total CD) were used to isolate VLPs for sequencing. We 
validated observations in two additional cohorts from Chicago 
and Boston that contained CD and UC patients and matched 
healthy control subjects (described in greater detail below; 
Tables S1 and S2). 

For the UK cohort, we obtained 1,111,569 ± 493,164 paired- 
end sequences per sample with an average sequence quality 
of 36.5 ± 3.7 (Table S4 and Experimental Procedures). Quality 
control trimming resulted in an average of 2% reduction in the 
number of sequences to 1,094,360 ± 503,337 with an average 
reduction in sequence length from 250 to 241.7 ± 3.7 bases 
and an average increase in quality score of 1.0 ± 0.03. There 
were no statistically significant differences in the number of total 
or quality-controlled sequences obtained between control and 
IBD patient samples (Figure S1A). Sequences were clustered 
at 95% identity to remove similar sequences and to generate 
sequences termed unique hereafter. Clustering resulted in an 
average reduction of 89% to 112,192 ± 71,820 unique sequences 
per sample. Interestingly, despite the fact that there were no sig- 
nificant differences in the total sequences obtained, we detected 
a significant increase in unique reads in CD patients compared to 
either household controls or UC patients (Figure S1A). 

Increased Caudovirales Taxonomic Richness 
Associated with IBD 

Unique sequences were mapped to a custom virus protein data- 
base, as described in the Experimental Procedures. We were 
able to assign viral taxonomy to an 
average of 17,022 ± 15,725 sequences 
(15%) (Figure S1A), with the majority 
belonging to Caudovirales and Micro- 
viridae bacteriophages (Figure S1B). 
Sequences were also assigned to 
many less-abundant viral taxa, including 
bacteriophages whose annotated hosts 
include bacteria commonly found in hu- 
man fecal samples and common eukary- 
otic viruses (Figure S1B). Analysis of our 
VLP sequences revealed a low level of 
contamination with human sequences 
(0%-4%). Possible contamination with bacterial sequences 
was confounded by the presence of integrated prophages in 
full genome sequences of bacteria. We mapped a subset of 
our total VLP enriched sequences from the UK cohort to the 
recently discovered crAssphage genome and detected it in 
71% of our samples (Dutilh et al., 2014) (Table S4). The percent- 
age of sequences that mapped to this virus varied greatly (range 
1%-89%) and did not correlate with disease status or drug 
treatment. 

Interestingly, we observed an increase in the richness of bac- 
teriophages, specifically, members of the Caudovirales in IBD 
(Figures 3A and 3B). It is unlikely that these differences can be 
attributed to an uneven number of samples or unique sequences 
in control and disease groups because the rate of acquisition of 
new bacteriophage taxa in disease samples rapidly outpaces 
new taxa acquisition in control samples (Figure 3B). No in- 
creases in Microviridae richness or diversity were observed (Fig- 
ure S2A), indicating that bacteriophage expansion was restricted 
to certain taxa and that our methods do not systematically in- 
crease bacteriophage richness in IBD samples compared to 
controls. Some IBD samples had many fewer Microviridae than 
controls, further supporting that the viromes were different be- 
tween the two disease states (Figure S2A). 
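
The argument about the rate of acquisition of new taxa rests on curves of cumulative richness against the number of samples drawn, averaged over repeated subsampling with replacement (500 iterations per depth in Figure 3B and 3C). The sketch below illustrates that procedure on invented toy data. The function name, the fixed random seed, and the representation of each sample as a set of taxon labels are assumptions made for the example, not details taken from the paper.

```python
import random


def accumulation_curve(samples, iterations=500, seed=0):
    """Mean cumulative richness at each depth, subsampling with replacement.

    samples: list of sets, each holding the taxa detected in one sample.
    Returns a list whose k-th entry (0-based) is the average number of
    distinct taxa seen when k + 1 samples are drawn with replacement.
    """
    rng = random.Random(seed)
    curve = []
    for depth in range(1, len(samples) + 1):
        total = 0
        for _ in range(iterations):
            drawn = rng.choices(samples, k=depth)
            total += len(set().union(*drawn))
        curve.append(total / iterations)
    return curve


if __name__ == "__main__":
    # Toy data: controls share most taxa, disease samples carry distinct ones.
    controls = [{"phage_a", "phage_b"}, {"phage_a", "phage_b"}, {"phage_a", "phage_c"}]
    disease = [{"phage_a", "phage_d"}, {"phage_e", "phage_f"}, {"phage_g", "phage_h"}]
    print("controls:", [round(v, 2) for v in accumulation_curve(controls)])
    print("disease: ", [round(v, 2) for v in accumulation_curve(disease)])
```

Comparing the curves at the same depth is what allows groups with different numbers of samples to be compared, since a steeper curve indicates that each additional sample contributes more previously unseen taxa.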

Taxonomic assignment of de novo assembled contigs longer 
than 1,000 nucleotides also indicated that Caudovirales were en- 
riched in IBD (Figure 3C). These data indicated that the VLP 
preparations contained partial or complete virus genomes and 
that the overwhelming majority of those assignable genomes 
Figure 2. In-Depth, Longitudinal Cohort Graphical Timeline 

(A) United Kingdom IBD stool samples and non-IBD household control stools were collected at the onset of symptoms (flare) and collected as symptoms resolved 
where indicated. The length of each IBD flare is indicated by the red shaded oval. The samples are annotated by household ID and sample number. 

(B) Chicago UC samples were collected first during inactive disease and again when symptoms exacerbated. The samples are annotated by the subject and 
sample number. 

See also Tables S1 and S2. 



were from Caudovirales bacteriophages. Taken together, Cau- 
dovirales sequences and assembled contigs were differentially 
expanded in CD and UC compared to household controls and 
were therefore capable of having disparate effects on the micro- 
biome and immune responses. 

Disease-Specific Changes in the Enteric Virome 

In addition to differences in bacteriophage richness between 
IBD and household control samples, we also observed striking 
differences in richness and the types of bacteriophage taxa 
observed between CD and UC samples (Figure 3D). A substan- 
tial number of taxa were observed among all samples; however, 
each disease type harbored unique bacteriophages. These 



differences in Caudovirales taxa could be the result of multiple 
factors, including health or disease, household, duration of 
cohabitation, disease activity, age, sex, and/or drug treatment. 
We therefore performed a multivariate analysis of Caudovirales 
abundance using MaAsLin (Multivariate Association with Linear 
Models) on VLP sequences compared to their household 
controls (Morgan et al., 2012). MaAsLin identified 35 different 
Caudovirales that were significantly associated with different 
households (43 total associations) (Tables S5 and S6). This rela- 
tionship was also revealed by plotting Caudovirales relative 
abundances, which indicate conservation within households 
(Figure 3E). This finding is consistent with a previous report 
(Reyes et al., 2010). Five of the Caudovirales identified by 
Figure 3. Bacteriophage Expansion Is Associated with CD and UC 

(A) Presence-absence heat map of the sequences assigned to Caudovirales taxa in VLP preparations from UK household control, CD and UC stool samples. 
(B and C) Rarefaction curves of Caudovirales richness versus an increasing number of subsamplings with replacement. 

(B) Caudovirales richness based on individual sequences. 

(C) Richness based on assembled Caudovirales contigs. The curves represent the average of 500 iterations at each depth of samples. 

(D) Venn diagram of the Caudovirales taxa in household control, CD and UC samples. n indicates the number of samples within each subgroup. 

(E and F) Plots of the relative abundance of the 35 most abundant Caudovirales taxa in the UK UC and CD households. Bars are annotated by the household ID 
and sample number. Green numbers indicate household controls; purple numbers indicate CD; red numbers indicate UC. 

See also Figure S1 and Tables S4, S5, and S6. 
MaAsLin were significantly associated with disease, including 
sequences most closely related to Lactococcus, Lactobacillus, 
Clostridium, Enterococcus, and Streptococcus bacteriophages 
(Table S6). The increased Caudovirales richness in CD samples 
relative to household controls corresponded with increased 
bacteriophage diversity (Figure S2B). This was not observed 
in UC, further highlighting differences in the virome between 
UC and CD. 

Inverse Correlation between IBD-Associated Changes 
in the Virome and Bacterial Microbiome 

We next assessed whether the increase in bacteriophage rich- 
ness observed in CD and UC was associated with a parallel 
change in bacterial populations. To answer this question, we 
performed bacterial 16S rRNA gene sequencing (Tables S1 
and S2). On average, we obtained 64,107 ± 38,570 sequences 
that were clustered at 97% identity into 56,096 ± 32,774 
operational taxonomic units (OTUs) per sample (Table S7). As 
expected, CD and UC were associated with a significant reduc- 
tion in bacterial diversity and bacterial richness compared to 
household controls (Figures 4A and 4B). However, we also 
observed significant similarity in the bacterial microbiome 
within IBD households as determined by permutation multivar- 
iate analysis of variance of the weighted UniFrac distances 
(Figures 4B and 4C; ADONIS p = 0.001, 999 permutations) 
(McArdle and Anderson, 2001). Interestingly, in patients that 
were sampled longitudinally, the bacterial diversity did not 
recover during periods of disease inactivity in either CD or UC 
(Figures 4B and 4C). 
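
The permutation test on distances used above can be illustrated with a short sketch. It is not the ADONIS implementation: it computes the pseudo-F statistic of McArdle and Anderson (2001) from a precomputed distance matrix and estimates a p value from 999 label permutations. The function names `pseudo_f` and `permanova`, the household labels, and the toy distances (standing in for weighted UniFrac distances) are assumptions made for illustration.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Pseudo-F from a square distance matrix and one group label per sample."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break any link between labels and distances
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical data: six samples on a line, three per household (assumption).
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(p - q) for q in positions] for p in positions]
toy_labels = ["H1", "H1", "H1", "H2", "H2", "H2"]
f_obs, p_value = permanova(toy_dist, toy_labels)
```

With only six samples the smallest attainable p value is about 0.1, which is why real analyses need many more samples per group than this toy example.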

Like the shifts observed for bacteriophages, the changes in 
bacterial diversity could be multifactorial. We therefore per- 
formed a multivariate analysis of bacterial abundance using 
MaAsLin (Morgan et al., 2012). We identified 18 different 
bacterial taxa that were significantly associated with disease or 
disease activity (Tables S5 and S8). The vast majority of the sig- 
nificant OTUs were of the Bacteroidetes and Firmicutes phyla, 
including significant differences in members of the bacterial fam- 
ilies Ruminococcaceae, Lachnospiraceae, Bacteroidaceae, and 
Prevotellaceae. Therefore, and as expected, disease-specific 
alterations in bacterial taxa were observed in our cohort; how- 
ever, the variation was also largely linked to specific households, 
supporting previous studies of the microbiome that have indi- 
cated similarities among IBD patients and their household 
contacts (Joossens et al., 2011). 

Importantly, the observed reduction in bacterial diversity was 
inversely related to the expansion of Caudovirales bacterio- 
phages in IBD. Comparing the bacterial and Caudovirales 
bacteriophage communities in household controls, CD and UC 
samples indicated clear differences between the disease states 
(Figure 5A). Among the UK cohort, significant positive correla- 
tions were observed in five out of six possible comparisons 
between bacterial diversity or richness and Caudovirales rich- 
ness or diversity. However, a significant inverse correlation 
was observed in CD samples between Caudovirales diversity 
and both bacterial richness and diversity, which suggests that 
bacteriophage expansion was not simply the result of increases 
in their bacterial hosts (Figure 5A). 

To further characterize the relationships between Caudovir- 
ales and bacterial taxa, we calculated the Spearman correlation 
between the Caudovirales taxa and the bacterial families found 
to be significantly altered in disease by MaAsLin analysis. Similar 
to the overall diversity and richness correlations, inverse correla- 
tions between Caudovirales and the significantly altered bacte- 
rial taxa are prevalent in UK CD patients (Figure 5B). In particular, 
the Bacteroidaceae bacterial families were inversely correlated 
with several Caudovirales taxa in CD. This corresponded with 
a reduction in the relative abundance of these bacterial taxa 
in CD patients compared to household controls. In contrast, 
the Caudovirales were positively correlated with the Enterobac- 
teriaceae and Pasteurellaceae bacterial families in CD; these 
bacterial families were increased in abundance in CD patients 
(Figure 5B). Positive correlations were also observed between 
the Caudovirales and Prevotellaceae in CD; however, there 
were no changes in the relative abundance of Prevotellaceae in 
CD, further indicating that we were not merely sequencing pro- 
phages. Fewer positive or negative correlations were observed 
in UC patients despite the significant expansion of Caudovirales 
bacteriophages and decreased bacterial diversity in these pa- 
tients (Figures 3B and 4A), further suggesting the existence of 
disease-specific elements of the virome-bacterial microbiome 
relationship in UC versus CD. 
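
The rank correlation used in this section can be sketched in a few lines. This is a minimal illustration that assumes relative abundances are available as plain lists with one value per sample. It computes Spearman's rho as the Pearson correlation of average ranks and does not reproduce the significance filtering (p < 0.05) applied in Figure 5. The function names and abundance values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical relative abundances across five CD samples (assumption).
phage_abundance = [0.30, 0.22, 0.15, 0.08, 0.02]
bacteroidaceae_abundance = [0.05, 0.10, 0.18, 0.25, 0.40]
rho = spearman_rho(phage_abundance, bacteroidaceae_abundance)
```

A value of rho near -1 corresponds to the inverse relationships reported for CD, whereas a temperate phage tracking its host would be expected to give a positive rho.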

Independent IBD Cohorts for Validation of Virome 
Findings 

We considered the possibility that our observations in UK IBD 
patients were either geographically determined or not reproduc- 
ible. Therefore, we performed bacterial 16S and VLP sequencing 
of stool samples from two additional cohorts of CD and UC pa- 
tients (Figure 2 and Tables S1 and S2). In these cases, household 
controls were not available, and so matched healthy controls 
were used. The first validation cohort was from Chicago and 
included 23 healthy control samples and 25 IBD samples 
(18 UC and 7 CD), including several samples that were part of 
the initial 454 analyses (Figure 1). For five of the Chicago UC 
patients, we were able to acquire samples both during inactive 
and active disease (Figure 2B). Two household controls that 
matched UC patients from Chicago were included in these ana- 
lyses as healthy subjects. The second validation cohort was from 
Boston and included 10 healthy control samples and 25 IBD 
samples (11 UC and 14 CD). 



Figure 4. Alterations in the Bacterial Community Composition in IBD 

(A) Alpha diversity (left) and bacterial species richness (right) based on 16S rRNA gene sequences in the stool of household controls, CD, and UC patients. Error 
bars indicate the mean ± SD. Statistical significance was determined by the Kruskal-Wallis test with Dunn’s correction comparing all samples to all samples. 
**p < 0.01. 

(B and C) Plots of alpha diversity normalized to the diversity in household controls (top) and relative bacterial family abundance (bottom) of UK (B) CD households 
and (C) UC households. Green numbers indicate household controls; purple numbers indicate CD; red numbers indicate UC. 

See also Tables S5, S7, and S8. 

Figure 5. Disease-Specific Bacteria-Caudovirales Patterns in IBD 

(A) Spearman correlation plot of Caudovirales richness, Caudovirales Shannon diversity, bacterial alpha diversity, and bacterial species richness for UK 
household control, CD, and UC samples. Statistical significance was determined for all pairwise comparisons; those with p values < 0.05 are indicated. Positive 
values (blue circles) indicate positive correlations, and negative values (red circles) indicate inverse correlations. The size and shading of the circle indicate the 
magnitude of the correlation where darker shades are more correlated than lighter shades. 

(B) Spearman correlation plots of the relative abundances of the 50 most abundant Caudovirales taxa and bacterial families identified to be significantly 
associated with disease. UK household control (top), CD (middle), and UC (bottom) samples. The gray bars indicate any taxa that were not detected in the cohort 
subgroup. Statistical significance was determined for all pairwise comparisons; only significant correlations (p value < 0.05) are displayed. 

See also Figure S2. 



Relationship between Caudovirales Richness and IBD 
across Cohorts 

As expected, significant reductions in bacterial diversity and rich- 
ness were observed in CD and UC patients compared to healthy 
controls from both the Chicago and Boston validation cohorts 
(Figures 6A-6C). MaAsLin analysis of bacterial taxa revealed 
significant associations with IBD and disease activity in both Chi- 
cago and Boston cohorts, although many fewer bacterial taxa 
were significantly associated with IBD than observed in the UK 
cohort, which included household controls (Tables S5 and S8). 

As observed in the UK cohort (Figure 3), a significant expan- 
sion of Caudovirales bacteriophages was observed in IBD 
patients from both validation cohorts (Figure 7). However, the 
specific relationships between bacteriophage richness and dis- 
ease varied between cohorts. In Chicago, CD patients were Cau- 
dovirales rich compared to healthy controls (Figures 7A and 7B). 
This was evident when richness was assessed for both individual 
sequences and Caudovirales contigs. In the Boston cohort, both 
CD and UC patients had increased Caudovirales, with UC con- 
tigs being more enriched than CD (Figures 7A and 7B). We 
also detected unique Caudovirales taxa in CD and UC samples 
from both the Chicago and Boston cohorts (Figure 7C), which 
was anticipated given our observations in the UK samples and 
the high inter-individual virome diversity reported previously 
(Reyes et al., 2010). Multivariate analysis of the relative abun- 
dance of Caudovirales in the Chicago and Boston samples 
revealed several disease-specific associations (Tables S5 and 
S6). We were unable to complete a correlation analysis to asso- 
ciate bacteriophage to bacteria abundances (as in Figure 5) due 
to the inadequate number of bacterial taxa associated with dis- 
ease diagnosis as determined by MaAsLin analysis (Table S8). 
Therefore, we were unable to validate specific relationships be- 
tween bacteriophage and bacterial taxa across our cohorts. 
Together, these data indicate that Caudovirales bacteriophages 
were expanded in both CD and UC patients compared to house- 
hold or healthy controls from three independent cohorts. 

Eukaryotic Viruses in IBD Cohorts 

We also took advantage of the available data from the three IBD 
cohorts to analyze sequences from eukaryotic viruses. Anellovi- 
rus sequences were more prevalent in IBD samples compared 
to healthy controls (Table S4; anellovirus positive: household 
controls, 0%; healthy controls, 4.7%; CD, 27%; UC, 29%). How- 
ever, anellovirus sequences from fecal samples were not de- 
tected in all patients and did not correlate with disease activity 
or drug treatment. 

Figure 6. Validation of Bacterial Dysbiosis in Two Additional Cohorts 

(A and B) Faith’s phylogenetic alpha diversity (left) and bacterial species richness (right) based on 16S rRNA gene sequences in the stool of (A) Chicago healthy 
controls, CD, and UC patients and (B) Boston healthy controls, CD, and UC patients. Error bars indicate the mean ± SD. Statistical significance was determined 
by the Kruskal-Wallis test with Dunn’s correction comparing all samples to all samples. *p < 0.05 and **p < 0.01. 

(C) Plot of alpha diversity normalized to the average diversity in healthy subjects (top) and relative bacterial family abundance (bottom) for the Chicago and Boston 
cohorts. Healthy control, UC, and CD samples are indicated. Longitudinal samples from the same subject are grouped together. 

See also Tables S5, S7, and S8. 

DISCUSSION 

In this paper, we demonstrate that disease-specific changes in 
the enteric virome occur in both major forms of IBD, Crohn’s dis- 
ease and ulcerative colitis. This was observed in a cohort of pa- 
tients in comparison to household controls with increased power 
to detect disease-associated changes in the metagenome and 
validated in two independent and geographically distinct cohorts 
containing matched controls. The primary change in the virome 
associated with IBD was a significant expansion of the taxo- 
nomic richness of Caudovirales bacteriophages. Importantly, 
although this change was observed in both CD and UC, the vi- 
ruses responsible for the change appeared to differ between 
the two diseases, suggesting that the virome is specific for CD 
versus UC. Comparison across cohorts revealed that enteric vi- 
romes were unique between individuals and between cohorts, 
which is consistent with previous reports on the incredible diver- 
sity of the human gut virome (Minot et al., 2011; Reyes et al., 
2010). Although the variability of gut bacteria and viruses in non- 
household controls and IBD patients makes it more difficult 
to observe specific relationships between individual bacterio- 
phages and individual bacterial taxa, an increase in bacterio- 
phage richness was consistently associated with IBD despite a 
decrease in bacterial richness and diversity. 

Figure 7. Validation of Caudovirales Expansion in Two Additional Clinical Cohorts 

(A and B) Rarefaction curves of Caudovirales richness versus an increasing number of subsamplings with replacement for the Chicago (top) and Boston (bottom) 
cohorts. 

(A) Caudovirales richness based on individual sequences. 

(B) Richness based on Caudovirales contigs. 

(C) Venn diagram of the Caudovirales taxa in healthy control, CD, and UC samples from Chicago (top) and Boston (bottom). n indicates the number of samples 
within each subgroup. 

See also Tables S4, S5, and S6. 

Our data are consistent with reports that detected more 
Caudovirales bacteriophage sequences in intestinal washes 
and biopsy tissues of pediatric CD patients compared to nonin- 
flammatory controls (Wagner et al., 2013) and enumerated more 
Caudovirales virions in CD biopsy washes by microscopy (Lep- 
age et al., 2008). We believe that our observations reflect an 
expansion of infectious bacteriophages in IBD for two reasons. 
First, we sequenced VLP preparations enriched for virions. Sec- 
ond, an expansion of temperate bacteriophages integrated in 
bacterial genomes would be predicted to positively correlate 
with their bacterial hosts while we observe an inverse correlation. 
However, our data do not rule out the possibility that bacterial 
species harboring specific prophages are also expanded. Anal- 
ysis of this will require full sequencing of the bacterial micro- 
biome to complement analysis of bacterial taxa via analysis of 
16S ribosomal RNA sequences. 

Potential Role of Bacteriophages in IBD 

In the human gut and in many ecosystems, the predominant 
bacteriophages are tailed, dsDNA Caudovirales and non-tailed, 
ssDNA Microviridae (Breitbart et al., 2003; Reyes et al., 2010). 
The biology of bacteriophages has been extensively reviewed 
(Brussow et al., 2004; Clokie et al., 2011; Fortier and Sekulovic, 
2013). The expansion in Caudovirales bacteriophage richness 
observed here could arise from the induction of prophage from 
commensal microbes or reflect the introduction of new viruses 
acquired from the environment, for example from food or contact 
with other people, including household contacts. These changes 
might have significant consequences for the bacterial micro- 
biome. For example, bacteriophages are primary drivers of bac- 
terial diversity and fitness in different ecosystems (Brussow et al., 
2004). In the gut, bacteriophages are responsible for the horizon- 
tal transfer of genetic material among bacterial communities, 
including those for pathogenesis (e.g., cholera toxin, pertussis 
toxin, and shiga toxin) and antibiotic resistance (Brussow et al., 
2004; Maiques et al., 2006; Zhang and LeJeune, 2008). Wide- 
spread bacteriophage induction, mutation, or introduction from 
external sources could effectively shuffle the deck of bacterial 
fitness and resistance genes. Second, the activation of latent pro- 
phages leads to the lysis of their bacterial hosts and can have pro- 
found ecological consequences (Weitz and Wilhelm, 2012). The 
intestinal microbiome has been shown to be sensitive to bacterio- 
phage invasion, leading to changes in the abundance of specific 
gut bacterial species (Reyes et al., 2013). Lysis of bacteria would 
also be expected to release proteins, lipids, and nucleic acids 
that serve as pathogen-associated molecular patterns (PAMPs) 
and antigens that trigger inflammatory signaling cascades 
to induce cytokines, cellular infiltration, and tissue damage. The 
development of animal models to test these predator-prey rela- 
tionship(s) in IBD pathogenesis and intestinal inflammation will 
certainly be an important area for future investigation. 

Another potential consequence of a change in enteric bacte- 
riophages might result from direct interactions between these vi- 
ruses and the mammalian host. For example, bacteriophages 
are able to translocate from the gastrointestinal (GI) lumen to 
systemic sites in animals (Gorski et al., 2006), CD patients, and 
healthy controls (Parent and Wilson, 1971). Bacteriophages are 
also capable of inducing humoral immune responses (Uhr 
et al., 1962). Further, in vitro stimulation of macrophages with 
bacteriophage particles induces MyD88-dependent proinflam- 
matory cytokine production (Eriksson et al., 2009). Chronic intes- 
tinal inflammation is the most basic element of IBD pathology, 
leading to the destruction of intestinal tissue and increased 
epithelial permeability. This leads to increased systemic expo- 
sure to the flow of microbial immunogens, potentially including 
those from bacteriophages and lysed bacterial cells, to maintain 
and further exacerbate inflammation. For these reasons, bacte- 
riophages may serve as antigens or innate immune ligands that 
stimulate host immunity and inflammation. 

The “Dark Matter” of the Virome 

The metagenomic sequencing of samples enriched for intact vi- 
rions has led to an appreciation of the incredible richness of 
enteric viruses in humans. Importantly, the methods we used 
here would not be expected to detect either enveloped viruses 
or RNA viruses and so there may be much to learn about these 
aspects of the IBD virome. Prior attempts to characterize the vi- 
rome using these methods were only able to assign 60% to 
87% of VLP sequences or contigs to anything within sequence 
databases (Minot et al., 2011, 2013). Across our cohorts, we 
were only able to assign on average 14% of VLP sequences 
to a viral database. The apparent discrepancy between our 
study and previous ones may be explained by our use of higher 
throughput Illumina-based sequencing, sequence databases, 
or taxonomic assignment criteria. The percent identity of those 
VLP sequences that were assigned in this study varied greatly in 
a subset of sequences that were analyzed (40% to 100% iden- 
tity; data not shown). This is a major issue for future studies, as 
we were only able to report on the bacteriophages that are most 
closely related to taxa in the database. It is likely that additional 
viruses are present in our sequencing data sets but are not de- 
tected due to this limitation. An implication of this is that current 
databases lack sufficient depth for us to be able to link specific 
bacteriophages to individual bacteria or disease. This limitation 
will only be overcome by significant expansion of the bacterio- 
phage sequences and annotations in available databases. This 
will not be a simple task and will require a global, coordinated 
effort to improve virus databases; it is clear that the Human 
Microbiome Project (http://www.hmpdacc.org/), an effort of 
similar scale, has improved the ability to classify and study bac- 
teria. As databases improve, the resolution of our picture of 
bacterial microbiome-virome associations will improve. Many 
of the sequences that we did not assign using our viral database 
map to bacterial genomes (data not shown), an assignment 
complicated by the fact that many bacterial genome sequences 
contain prophages that have not been independently anno- 
tated. Addressing the misannotation of integrated prophages 
as “bacterial” sequences will require development of new tools 
and significant improvement of viral databases containing 
a much greater diversity of fully annotated complete viral 
genomes. 

Virome Implications for Disease Pathogenesis, 
Treatment, and Monitoring 

The primary therapeutic goal in IBD is to limit inflammation and 
halt or even reverse tissue destruction. In many cases, pharma- 
cologic or biomolecule therapies fail, and surgical resection of in- 
flamed/damaged tissue is required. Thus, novel approaches are 
required to optimize IBD management. One potential approach 
is the manipulation of enteric bacteria through probiotics and 
prebiotics, which have had limited success in humans thus far 
(Butterworth et al., 2008; Naidoo et al., 2011; Shanahan and 
Quigley, 2014). Fecal transplantation from healthy donors or 
with defined bacterial cultures is an approach that is gaining 
traction due to its success in the treatment of recurring Clos- 
tridium difficile infections (van Nood et al., 2013). However, early 
attempts at curing bacterial dysbiosis in UC through fecal trans- 
plantation have not proven successful (Angelberger et al., 2013; 
Kump et al., 2013), perhaps due to the instability of the donor 
microbiome in IBD patients. It is intriguing to speculate that a dis- 
ease-associated, taxonomically rich virome in the recipient may 
interact with donor microbes to limit probiotic or fecal transplan- 
tation efficacy. Defining the virome before and during probiotic 
or fecal transplantation will be required to assess this possibility. 
It is notable that, in animal models, eukaryotic viruses change 
intestinal biology and inflammation by acting in concert with 
host bacteria in a manner dependent on host genetics (Basic 
et al., 2014; Cadwell et al., 2010). Thus both eukaryotic viruses 
and bacteriophages may have a role in IBD through interactions 
with the bacterial microbiome. It will be important to more 
completely understand the interactions between bacteria and 
viruses and viruses and the host to be able to develop and 
personalize these approaches to managing IBD. 

Our data also suggest that the specific expansion of bacterio- 
phages in CD is associated with decreased bacterial diversity. 
The development of methods that block the infection of their 
bacterial hosts by these bacteriophages is worth investigation. 
Furthermore, the identification of disease-specific Caudovirales 
could prove useful in differentiating CD and UC in the ~10% of 
cases of IBD in which the clinical phenotype is indeterminate be- 
tween the two. Here, we did not observe any significant changes 
in the virome in IBD patients over time as disease activity 
changed. However, larger cohorts with more frequent repeated 
sampling of both IBD patients and household controls are 
required to more fully assess enteric virome stability in IBD. 

Although IBD is associated with bacterial dysbiosis, addi- 
tional, and much more common, diseases are also associated 
with changes in the bacterial microbiome. These include dia- 
betes, obesity, metabolic diseases, and cancer (Larsen et al., 
2010; Ley et al., 2005; Nicholson et al., 2012; Qin et al., 2012; 
Shen et al., 2010; Turnbaugh et al., 2006, 2009; Zackular et al., 
2013). Our data indicate that understanding the bacterial micro- 
biome in these diseases likely requires concurrent analysis of the 
virome. More speculatively, we question whether bacterial mi- 
crobiome changes in many diseases are secondary to changes 
in emergence of temperate bacteriophages or introduction of 
bacteriophages from food, the environment, or through human 
or animal contact. Thus, data presented here identifying inverse 
relationships between the bacterial microbiome and the enteric 
virome open up a new area of research in IBD and perhaps other 
diseases that have been shown to be associated with changes in 
the bacterial microbiome. 

EXPERIMENTAL PROCEDURES 
Cohort Description 

Stool samples were collected at four gastroenterology clinics: (1) Adden- 
brooke’s Hospital, University of Cambridge, UK; (2) Cedars-Sinai Hospital, 
Los Angeles, USA; (3) Rush University Medical Center, Chicago, USA; and 
(4) Massachusetts General Hospital, Boston, USA. A detailed list of the sub- 
jects and a full description of each cohort are included in the Supplemental 
Information. 

Virus-like Particle Enrichment and Sequencing 
Virus-like Particle Preparation 

VLPs were enriched from pulverized human stool using a protocol based on 
previously described methods (Reyes et al., 2010; Thurber et al., 2009). Stool 
sample filtrates were treated with lysozyme and chloroform to degrade any 
unfiltered bacterial and host cell membranes. Nonvirus protected DNA was 
degraded by treating with a DNase cocktail followed by heat inactivation of 
DNases. VLPs were lysed and nucleic acid was extracted. 

VLP DNA was amplified, and DNA was randomly fragmented by ultrasonica- 
tion before Illumina library construction. An equimolar pool of 12 samples was 
sequenced on an Illumina MiSeq instrument. 

Sequence Processing and Analysis 

Adapters and low-quality bases were trimmed, and reads were clustered at 
95% identity. Unique sequences were queried against a customized viral data- 
base using BLASTx. Reads were assigned taxonomy using the lowest-com- 
mon ancestor algorithm as implemented in MEGAN (v5.1.5) (Huson et al., 
2011). Absolute read counts for selected viral taxa were exported from 
MEGAN and imported into R, data were normalized, and richness and diversity 
were calculated. 
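
The final step above (normalization, richness, and diversity) can be sketched as follows. The original analysis was performed in R and the normalization method is not stated here, so this Python sketch makes two assumptions: counts are normalized to relative abundance by total-sum scaling, and diversity is the Shannon index with the natural logarithm. The function names, taxon names, and counts are hypothetical.

```python
import math


def relative_abundance(counts):
    """Total-sum scaling of read counts; taxa with zero reads are dropped."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items() if c > 0}


def richness(counts):
    """Number of taxa with at least one assigned read."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over detected taxa."""
    return -sum(p * math.log(p) for p in relative_abundance(counts).values())


# Hypothetical read counts for one sample (assumption).
toy_counts = {
    "Lactococcus phage": 50,
    "Enterococcus phage": 25,
    "Streptococcus phage": 25,
    "Clostridium phage": 0,
}
sample_richness = richness(toy_counts)
sample_shannon = shannon(toy_counts)
```

Richness counts only which taxa are present, while the Shannon index also reflects how evenly reads are spread among them, so the two measures can move independently.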

De Novo Contig Assembly 

Contigs were assembled using the IDBA_UD assembler (v 1.1.0) using mini- 
mum and maximum kmer lengths of 20 and 120, respectively (Peng et al., 
2012). All contigs larger than 1,000 nucleotides were compared to a viral 
genome reference sequence database consisting of 5,500 viral genomes 
available in NCBI as of July 7, 2014, using dc_megablast. 

Bacterial 16S rRNA Analysis 
16S rRNA Gene Analysis 

Stool total nucleic acid was extracted from aliquots of pulverized human stool, 
as previously described (Reyes et al., 2013) with minor modifications for lower 
throughput processing of human stool (Supplemental Information). Primer 
selection and PCR was performed following previously described methods 
(Caporaso et al., 2011). The final pooled samples were sequenced on the Illu- 
mina MiSeq platform. 16S analysis was done in QIIME (Quantitative Insights 
into Microbial Ecology, version 1.8.0) (Caporaso et al., 2010), and OTU relative 
abundance, diversity and richness plots were generated in GraphPad Prism 
(version 6.0d). 

The MG-RAST project ID for the Roche 454 sequences reported in Figure 1 of 
this paper is 11446. The EMBL-EBI accession number for the VLP and 16S 
sequences reported in this paper is PRJEB7772. 

SUMMARY 

Viable yet damaged cells can accumulate during 
development and aging. Although eliminating those 
cells may benefit organ function, identification of 
this less fit cell population remains challenging. Pre- 
viously, we identified a molecular mechanism, based 
on “fitness fingerprints” displayed on cell mem- 
branes, which allows direct fitness comparison 
among cells in Drosophila. Here, we study the phys- 
iological consequences of efficient cell selection for 
the whole organism. We find that fitness-based cell 
culling is naturally used to maintain tissue health, 
delay aging, and extend lifespan in Drosophila. We 
identify a gene, azot, which ensures the elimination 
of less fit cells. Lack of azot increases morphological 
malformations and susceptibility to random muta- 
tions and accelerates tissue degeneration. On the 
contrary, improving the efficiency of cell selection is 
beneficial for tissue health and extends lifespan. 

INTRODUCTION 

Individual cells can suffer insults that affect their normal func- 
tioning, a situation often aggravated by exposure to external 
damaging agents. A fraction of damaged cells will critically 
lose their ability to live, but a different subset of cells may be 
more difficult to identify and eliminate: viable but suboptimal 
cells that, if unnoticed, may adversely affect the whole organism 
(Moskalev et al., 2013). 

What is the evidence that viable but damaged cells accumu- 
late within tissues? The somatic mutation theory of aging (Ken- 
nedy et al., 2012; Szilard, 1959) proposes that over time cells 
suffer insults that affect their fitness, for example, diminishing 
their proliferation and growth rates, or forming deficient struc- 
tures and connections. This creates increasingly heteroge- 
neous and dysfunctional cell populations disturbing tissue 
and organ function (Moskalev et al., 2013). Once organ function 
falls below a critical threshold, the individual dies. The theory is 
supported by the experimental finding that clonal mosaicism 
occurs at unexpectedly high frequency in human tissues as a 
function of time, not only in adults due to aging (Jacobs 
et al., 2012; Laurie et al., 2012), but also in human embryos 
(Vanneste et al., 2009). 

Does the high prevalence of mosaicism in our tissues mean 
that it is impossible to recognize and eliminate cells with subtle 
mutations and that suboptimal cells are bound to accumulate 
within organs? Or, on the contrary, can animal bodies identify 
and get rid of unfit viable cells? 

One indirect mode through which suboptimal cells could be 
eliminated is proposed by the “trophic theory” (Levi-Montalcini, 
1987; Moreno, 2014; Raff, 1992; Simi and Ibanez, 2010), which 
suggested that Darwinian-like competition among cells for 
limiting amounts of survival-promoting factors will lead to 
removal of less fit cells. However, it is apparent from recent 
work that trophic theories are not sufficient to explain fitness- 
based cell selection, because there are direct mechanisms 
that allow cells to exchange “cell-fitness” information at the local 
multicellular level (Moreno and Rhiner, 2014). 

In Drosophila, cells can compare their fitness using different 
isoforms of the transmembrane protein Flower. The “fitness 
fingerprints” are therefore defined as combinations of Flower 
isoforms present at the cell membrane that reveal optimal or 
reduced fitness (Merino et al., 2013; Rhiner et al., 2010). The iso- 
forms that indicate reduced fitness have been called Flower^Lose 
isoforms, because they are expressed in cells marked to be elim- 
inated by apoptosis called “Loser cells” (Rhiner et al., 2010). 
However, the presence of Flower^Lose isoforms at the cell mem- 
brane of a particular cell does not imply that the cell will be culled, 
because at least two other parameters are taken into account: (1) 
the levels of Flower^Lose isoforms in neighboring cells: if neigh- 
boring cells have similar levels of Lose isoforms, no cell will be 
killed (Merino et al., 2013; Rhiner et al., 2010); (2) the levels of a 
secreted protein called Sparc, the homolog of the Sparc/Osteo- 
nectin protein family, which counteracts the action of the Lose 
isoforms (Portela et al., 2010). 

Remarkably, the levels of Flower isoforms and Sparc can be 
altered by various insults in several cell types, including: (1) the 
appearance of slowly proliferating cells due to partial loss of ribo- 
somal proteins, a phenomenon known as cell competition (Baillon 
and Basler, 2014; de Beco et al., 2012; Hogan et al., 2011; 
Morata and Ripoll, 1975; Moreno et al., 2002; Tamori and 
Deng, 2011); (2) the interaction between cells with slightly higher 
levels of d-Myc and normal cells, a process termed supercom- 
petition (de la Cova et al., 2004; Moreno and Basler, 2004); (3) 
mutations in signal transduction pathways like Dpp signaling 
(Portela et al., 2010; Rhiner et al., 2010); or (4) viable neurons 
forming part of incomplete ommatidia (Merino et al., 2013). 
Intriguingly, the role of Flower isoforms is cell type specific, 
because certain isoforms acting as Lose marks in epithelial cells 
(Rhiner et al., 2010) are part of the fitness fingerprint of healthy 
neurons (Merino et al., 2013). Therefore, an exciting picture 
starts to appear, in which varying levels of Sparc and different 
isoforms of Flower are produced by many cell types, acting as 
direct molecular determinants of cell fitness. 

Here, we aimed to clarify how cells integrate fitness informa- 
tion in order to identify and eliminate suboptimal cells. Subse- 
quently, we analyzed the physiological consequences 
of efficient cell selection for the whole organism. 

RESULTS 

Azot Is Expressed in Cells Undergoing Negative 
Selection 

In order to discover the molecular mechanisms underlying cell 
selection in Drosophila, we analyzed genes transcriptionally 
induced using an assay where WT cells (tub>Gal4) are outcom- 
peted by dMyc-overexpressing supercompetitors (tub>dmyc) 
(Figure 1D) due to the increased fitness of these dMyc-overex- 
pressing cells (Rhiner et al., 2010). The expression of CG11165 
(Figure S1A available online) was strongly induced 24 hours (hr) after 
the peak of flower and sparc expression (Figure S1B). In situ hy- 
bridization revealed that CG11165 mRNA was specifically de- 
tected in Loser cells that were going to be eliminated from 
wing imaginal discs due to cell competition (Figure S1C). The 
gene, which we named ahuizotl (azot) after a multihanded Aztec 
creature selectively targeting fishing boats to protect lakes 
(Reeves, 2006), consists of one exon. azot’s single exon encodes 
a four EF-hand-containing cytoplasmic protein of the canon- 
ical family (Figures S1D and S1E) that is conserved, but unchar- 
acterized, in multicellular animals (Figure S1A). 

To monitor Azot expression, we designed a translational
reporter resulting in the expression of Azot::dsRed under the control
of the endogenous azot promoter in transgenic flies (Figure 1A).
Azot expression was not detectable in most wing imaginal discs
under physiological conditions in the absence of competition
(Figures 1B and 1C). We next generated mosaic tissue of two
clonal populations, which are known to trigger competitive
interactions resulting in elimination of otherwise viable cells. Cells
with lower fitness were created by confronting WT cells with
dMyc-overexpressing cells (Figures 1E-1H) (Moreno and Basler,
2004), by downregulating Dpp signaling (Moreno et al., 2002)
(Figures 1I-1K), by overexpressing Flower Lose isoforms (Rhiner et al.,
2010) (Figures 1L and 1M), in cells with reduced Wg signaling
(Figure S1F) (Vincent et al., 2011), by suppressing Jak-Stat signaling
(Rodrigues et al., 2012) in subgroups of cells (Figure S1G), or by
generating Minute clones (Lolo et al., 2012; Morata and Ripoll,
1975; Simpson, 1979) (Figure S1H). Azot expression was not
detectable in nonmosaic tissue of identical genotype (Figures
1N-1P; Figures S1I and S1J), nor in control clones overexpressing
UASlacZ (Figure S1K). On the contrary, Azot was
activated in all tested scenarios of cell competition, specifically in
the cells undergoing negative selection (“Loser cells”) (Figures
1D-1M). Azot expression was not repressed by the caspase
inhibitor protein P35 (Figures 1G and 1H).

Because Flower proteins are conserved in mammals (Petrova 
et al., 2012), we decided to test if they are also able to regulate 
azot. Mouse Flower isoform 3 (mFlower3) has been shown to
act as a “classical” Lose isoform, driving cell elimination when
expressed in scattered groups of cells (Petrova et al., 2012), a
situation where azot was induced in Loser cells (Figures 1Q
and 1R), but it does not induce cell selection when expressed
ubiquitously, a scenario where azot was not expressed (Figures 1S
and 1T). This shows that the mouse Flower Lose isoforms function
in Drosophila similarly to their fly homologs.

Interestingly, azot is not a general apoptosis-activated gene
because its expression is not induced upon eiger, hid, or bax
activation, which trigger cell death (Fuchs and Steller, 2011;
Gaumer et al., 2000) (Figures S1L-S1N). Azot was also not
expressed during elimination of cells with defects in apicobasal
polarity (Figure S1O) or undergoing epithelial exclusion-mediated
apoptosis (dCsk) (Figures S1P and S1Q) (Vidal et al., 2006).

Next, we analyzed if azot is expressed during the elimination of
peripheral photoreceptors in the pupal retina, a process
mediated by Flower-encoded fitness fingerprints (Merino et al.,
2013). Thirty-six to 38 hr after pupal formation (APF), when
Flower Lose-B expression begins in peripheral neurons (Merino et al.,
2013), we could not detect Azot expression in the peripheral
edge (Figures S1R-S1U). At later time points (40 and 44 hr
APF), Azot expression is visible and restricted to the peripheral
edge where photoreceptor neurons are eliminated (Figures 1U
and 1V). This expression was confirmed with another reporter
line, azot{KO; gfp}, where gfp was directly inserted at the azot
locus using genomic engineering techniques (Huang et al., 2009)
(Figures 1W-1Y).

From these results, we conclude that Azot expression is acti- 
vated in several contexts where suboptimal and viable cells are 
normally recognized and eliminated. 

Azot Is Required to Eliminate Loser Cells and Unwanted 
Neurons 

To understand Azot function in cell elimination, we generated 
azot knockout (KO) flies, whereby the entire azot gene was 
deleted (Figure 1W). Next, we analyzed Azot function using 



Figure 1. Azot Is Expressed during Cell Selection of Viable Unfit Cells 

(A-M) Expression analysis of Azot during different types of cell competition. For all pictures, Azot::dsRed reporter (A) is in red, and merges show outcompeted
clones (green, marked with GFP) of several genotypes. DAPI is in blue. The following genotypes were analyzed: (B and C) azot::dsRed and (D-F) tub>dmyc
background (black) and WT cells marked with GFP (green). Clones were generated as shown in (D) and analyzed 48 hr ACI. (G and H) tub>dmyc background
(black) and WT cells marked with GFP (green) expressing in addition the P35 caspase inhibitor (UASp35). Forty-eight hours ACI. (I-M) Flip-out clones (green)
generated as shown in (I) and overexpressing brinker (UASbrinker) (J and K), fweLose-B (UASfweLose-B) (L and M), or mfwe3 (UASmfwe3)
(Q and R). Twenty-four hours ACI.

(N-P, S, and T) General overexpression of UASfweLose-B and UASmfwe3 using the actin promoter as shown in (N).

(U-Y) Pupal retinas at different developmental time points. (U and V) Expression analysis of Azot (red), using Azot::dsRed, in peripheral photoreceptors at 40 hr
after pupa formation (APF). (W) Genomic engineering strategy used for the generation of azot knockout (KO) flies. (X and Y) GFP expression (green)
driven by the azot promoter in azot{KO; gfp}, 44 hr APF, DAPI (blue, Y).
dmyc-induced competition. In the absence of Azot function,
loser cells were no longer eliminated (Figures 2A-2F), showing
a dramatic 100-fold increase in the number of surviving clones
(Figures 2B and 2E). Loser cells occupied more than 20% of
the tissue 72 hr after clone induction (ACI) (Figures 2B and 2F).
Moreover, using azot{KO; gfp} homozygous flies (that express
GFP under the azot promoter but lack Azot protein), we found
that loser cells survived and showed accumulation of GFP
(Figures S2A and S2B). From these results, we conclude that
azot is expressed by loser cells and is essential for their
elimination.

In addition, clone removal was delayed in an azot heterozygous
background (50-fold increase, 15%) (Figures 2E and 2F),
compared to control flies with normal levels of Azot (1-fold,
1%) (Figures 2A, 2E, and 2F). Cell elimination capacity was fully
restored by crossing two copies of Azot::dsRed into the azot-/-
background (0.5-fold, 0.2%), demonstrating the functionality of
the fusion protein (Figures 2C, 2E, and 2F). Silencing azot with
two different RNAis was similarly able to halt selection during
dmyc-induced competition (Figures S2C-S2P). Next, in order
to determine the role of Azot’s EF hands, we generated and
overexpressed a mutated isoform of Azot (Pm4Q12) carrying, in each
EF hand, a point mutation known to abolish Ca2+ binding (Maune
et al., 1992). Although overexpression of wild-type azot in
negatively selected cells did not rescue the elimination (Figures S2E,
S2I, S2L, and S2P), overexpression of the mutant AzotPm4Q12
reduced cell selection (Figures S2H, S2I, S2O, and S2P),
functioning as a dominant-negative mutant. This shows that Ca2+
binding is important for Azot function. Finally, staining for
apoptotic cells corroborated that the lack of Azot prevents cell
elimination, because cell death was reduced 8-fold in mosaic
epithelia containing loser cells (Figure 2D).

Next, we analyzed the role of azot in elimination of peripheral
photoreceptor neurons in the pupal retina using homozygous
azot KO flies (Figures 2G-2L). Pupal retinas undergoing
photoreceptor culling (44 hr APF) of azot+/+ and azot-/- flies were stained
for the cell death marker TUNEL (Figures 2G and 2I) and the
proapoptotic factor Hid (Figures 2H and 2J). Consistent with the
expression pattern of Azot, the number of Hid and TUNEL-positive
cells was dramatically decreased in azot-/- retinas (Figures
2I-2L) compared to azot+/+ retinas (Figures 2G, 2H, 2K, and 2L).

These results showed that Azot was required to induce cell
death and Hid expression during neuronal culling. Therefore,
we tested if that was also the case in the wing epithelia during
dmyc-induced competition. We found that Hid was expressed
in loser cells and that the expression was strongly reduced in
the absence of Azot function (Figures 2M-2Q).

Finally, forced overexpression of Flower Lose isoforms from
Drosophila (Figures S2Q, S2R, and S2T) and mice (Figures 2R-
2T; Figures S2S and S2U) was unable to mediate WT cell
elimination when Azot function was impaired by mutation or silenced
by RNAi.

These results suggested that azot function was dose sensitive,
because heterozygous azot mutant flies displayed delayed
elimination of loser cells when compared with azot WT flies
(Figure 2E). We therefore took advantage of our functional reporter
Azot::dsRed (Figures 2C and 2E) to test whether cell elimination
could be enhanced by increasing the number of genomic copies
of azot. We found that tissues with three functional copies of azot
were more efficient at eliminating loser cells during dmyc-induced
competition and most of the clones were culled 48 hr ACI
(Figures 2U-2W).

From these results, we conclude that azot expression is 
required for the elimination of Loser cells and unwanted neurons
(Figure 2X). 

Azot Maintains Tissue Fitness during Development 

Next, we asked what could be the consequences of decreased
cell selection at the tissue and organismal level. To this
end, we took advantage of the viability of homozygous azot KO
flies. We observed an increase of several developmental
aberrations. We focused on the wings, where cell competition is best
studied and where aberrations, which comprised melanotic
areas, blisters, and wing margin nicks, were easy to define
(Figures 3A-3E). Wing defects of azot mutant flies could be rescued
by introducing two copies of azot::dsRed, showing that the
phenotypes are specifically caused by loss of Azot function (Figures
3A-3E).

Next, we reasoned that mild tissue stress should increase the
need for fitness-based cell selection after damage. First, in order
to generate multicellular tissues scattered with suboptimal cells,
we exposed larvae to UV light (Figure 3F) and monitored Azot
expression in wing discs of UV-irradiated WT larvae, which
were stained for cleaved caspase-3, 24 hr after treatment
(Figures 3G-3K). Under such conditions, Azot was found to be
expressed in cleaved caspase-3-positive cells (Figures 3H-3K).
All Azot-positive cells showed caspase activation and 17% of
cleaved caspase-positive cells expressed Azot (Figure 3G).
This suggested that Azot-expressing cells are culled from the
tissue. To confirm this, we looked at later time points (3 days after



Figure 2. Azot Is Required to Eliminate Loser Cells and Unwanted Neurons 

(A-F) Analysis of azot KO during dmyc-induced supercompetition 72 hr ACI. (D) Quantification of cleaved caspase-3 and GFP-positive cells during dmyc-induced
supercompetition in azot+/+ and azot-/- backgrounds (p < 0.01) 72 hr ACI. (E) Quantification of number of clones; the following backgrounds were analyzed: (A
and E) azot+/+, (E) azot+/- (p < 0.01), (B and E) azot-/- (p < 0.01), and (C and E) azot-/-;+/+ (p > 0.05). (F) Percentage of the wing pouch occupied by the WT cells in
the (A and F) azot+/+, (F) azot+/-, (B and F) azot-/-, (C and F) azot-/-;+/+.

(G-L) Role of azot during neuronal culling in the pupal retina. (K and L) Quantification of the number of apoptotic (TUNEL-positive, magenta) or Hid-expressing
(red) peripheral photoreceptors, in azot+/+ (G, H, K, and L) and azot-/- (p < 0.01) (I, J, K, and L) flies. DAPI is in blue.

(M-Q) Hid expression (red) in loser clones (green) during supercompetition 48 hr ACI in azot+/+ (M, N, and Q) and azot-/- (O-Q) backgrounds.

(R-T) Seventy-two hours ACI mfwe3-overexpressing clones (UASmfwe3) in azot+/+ (R and T) and azot+/- (S and T) backgrounds (p < 0.01).

(U-W) Analysis of an extra genomic copy of azot during dmyc-induced supercompetition. (U) Quantification of the number of clones during dmyc-induced
supercompetition with or without an extra genomic copy of azot. (V and W) Discs analyzed 48 hr ACI in azot+/+ (V) and azot+/+; azot+ (p < 0.01) (W).

(X) Azot expression is required for cell-competition-mediated apoptosis of loser cells. Data are represented as mean ± SEM.
irradiation; Figure S3A) and found that the increase in Azot-positive
cells was no longer detectable (Figures S3B-S3D). The
elimination of azot-expressing cells after UV irradiation required azot
function, because cells revealed by reporter azot{KO; gfp}, which
express GFP instead of Azot, persisted in wing imaginal discs
from azot-null larvae (Figures S3E-S3G). We therefore tested if
lack of azot leads to a faster accumulation of tissue defects
during organ development upon external damage. We irradiated
azot-/- pupae at stage 0 (Figures 3L-3P) and compared the
number of morphological defects in adult wings to those in
nonirradiated azot KO flies (Figures 3A-3E). We found that aberrations
increased more than 2-fold when compared to nonirradiated
azot-/- flies (Figures 3L-3P).

In order to functionally discriminate whether azot belongs
to genes regulating apoptosis in general or is dedicated to
fitness-based cell selection, we examined if azot silencing
prevented Eiger/TNF-induced cell death (GMR-Gal4, UASeiger)
(Figures S3H-S3N). Inhibiting apoptosis (UASp35) or eiger
(UASRNAieiger) rescued eye ablation, whereas azot silencing
and overexpression of AzotPm4Q12 did not (Figures S3I-S3N).
Furthermore, azot silencing did not impair apoptosis during
genitalia rotation (Figures S3O-S3R) (Suzanne et al., 2010) or cell
death of epithelial precursors in the retina (Figures S3S-S3V)
(Wolff and Ready, 1991).

The results shown above highlight the consequences of
nonfunctional cell-quality control within developing tissues 
(Figure 3Q). 

azot Promoter Computes Relative Flower Lose and Sparc
Levels

Next, we performed epistasis analyses to understand at which
level azot is transcriptionally regulated. For this purpose, we
again used the assay where WT cells are outcompeted by
dMyc-overexpressing supercompetitors (Figure 1D). We had
previously observed that azot induction is triggered upstream
of caspase-3 activation and that Azot accumulated in outcompeted
cells unable to die (Figures 1G and 1H). Then, we genetically
modified upstream events of cell selection (Figures 4A-4G): silencing
fwe Lose transcripts by RNAi or overexpressing Sparc both
blocked the induction of Azot::dsRed in WT loser cells (Figures
4A-4D and 4G). In contrast, when outcompeted WT cells were
additionally “weakened” by Sparc downregulation using RNAi,
Azot was detected in almost all loser cells (Figures 4E-4G)
compared to its more limited induction in the presence of
endogenous Sparc (Figures 1E, 1F, and 4G). Inhibiting JNK
signaling with UASpuc (Martín-Blanco et al., 1998; Moreno
et al., 2002) did not suppress Azot expression (Figures S4A
and S4B).



Next, we analyzed the activation of Azot upon irradiation.
Strikingly, we found that all Azot expression after irradiation
was eliminated when Flower Lose was silenced and also when
relative differences of Flower Lose were diminished by
overexpressing high levels of Lose isoforms ubiquitously (Figures 4H-
4K; Figure S4C). On the contrary, Azot was not suppressed after
irradiation by expressing the prosurvival factor Bcl-2 or a p53
dominant negative (Brodsky et al., 2000; Gaumer et al., 2000)
(Figures S4C-S4G). These results show that Azot expression
during competition and upon irradiation requires differences in
Flower Lose relative levels.

Finally, we analyzed the regulation of Azot expression in
neurons. Silencing fwe transcripts by RNAi blocked the induction of
Azot::dsRed in peripheral photoreceptors (Figures 4L and 4M;
Figure S4H). Because Wingless signaling induces Flower Lose-B
expression in peripheral photoreceptors (Merino et al., 2013),
we tested if overexpression of Daxin, a negative regulator of
the pathway (Willert et al., 1999), affected Azot levels and found
that it completely inhibited Azot expression (Figures S4H-S4J).
Similarly, overexpression of the cell competition inhibitor Sparc
also fully blocked Azot endogenous expression in the retina
(Figures S4H, S4K, and S4L). Finally, ectopic overexpression of
Flower Lose-B in scattered cells of the retina was sufficient to
trigger ectopic Azot activation (Figures S4M-S4O). These results
show that photoreceptor cells also can monitor the levels of
Sparc and the relative levels of Flower Lose-B before triggering
Azot expression (Figure S4P).

The results described above suggest that the azot promoter 
integrates fitness information from neighboring cells, acting as 
a relative “cell-fitness checkpoint” (Figures 4N-4Q). 
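
As a minimal sketch of this checkpoint logic, the qualitative rule summarized in Figures 4N-4Q can be written as a Boolean function. The numeric scales, the sparc_threshold parameter, and the names CellState and azot_induced are illustrative assumptions and are not part of the experimental data; the sketch only encodes the rule that azot is induced when a cell carries Flower Lose at higher levels than its neighbors and lacks sufficient Sparc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellState:
    """Hypothetical fitness readout of one cell (arbitrary units)."""
    own_lose: float       # the cell's own Flower Lose level
    neighbor_lose: float  # Flower Lose level in the surrounding cells
    sparc: float          # the cell's Sparc level


def azot_induced(cell: CellState, sparc_threshold: float = 1.0) -> bool:
    """Return True if this toy model predicts azot expression.

    sparc_threshold is an assumed cutoff for "enough Sparc";
    the source gives no quantitative value.
    """
    if cell.own_lose <= 0:
        # No Flower Lose isoforms: azot is not expressed (Figure 4N).
        return False
    if cell.own_lose <= cell.neighbor_lose:
        # Neighbors carry equal or higher Lose levels (Figure 4O).
        return False
    if cell.sparc >= sparc_threshold:
        # High Sparc counteracts the Lose signal (Figure 4P).
        return False
    # Higher relative Lose and not enough Sparc (Figure 4Q).
    return True


if __name__ == "__main__":
    scenarios = {
        "no Lose": CellState(0.0, 0.0, 0.0),
        "ubiquitous Lose": CellState(1.0, 1.0, 0.0),
        "Lose plus high Sparc": CellState(1.0, 0.0, 2.0),
        "relative Lose, low Sparc": CellState(1.0, 0.0, 0.0),
    }
    for name, cell in scenarios.items():
        print(f"{name}: azot induced = {azot_induced(cell)}")
```

Under these assumptions, a cell with Lose marks that is surrounded by cells with equal or higher Lose levels returns False, in line with the loss of Azot expression when Lose isoforms are overexpressed ubiquitously.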

Cell Selection Is Active during Adulthood 

To test if fitness-based cell selection is a mechanism active not
only during development, but also during adult stages, we
exposed WT adult flies to UV light and monitored Azot and
Flower expression in adult tissues (Figures 5A-5T). UV irradiation
of adult flies triggered cytoplasmic Azot expression in several
adult tissues including the gut (Figures 5B-5E; Figures S5A
and S5B) (Lemaitre and Miguel-Aliaga, 2013) and the adult brain
(Figures 5F-5J) (Fernandez-Hernandez et al., 2013). Likewise,
UV irradiation of adult flies triggered Flower Lose expression in
the gut (Figures 5K-5N) and in the brain (Figures 5O-5T).
Irradiation-induced Azot expression was unaffected by Bcl-2 but was
eliminated when Flower Lose was silenced or when relative
differences of Flower Lose were diminished in the gut (Figures
S5C-S5E) and in the adult brain (Figures S5F-S5H). This
suggests that the process of cell selection is active throughout the
life history of the animal. Further confirming this conclusion,



Figure 3. Azot Mutants Show Developmental Aberrations 

(A-E) Wings of 10- to 13-day-old flies and quantification of developmental aberrations in the wing of each genotype, **p < 0.01. (A and B) azot+/+, (A and C)
azot-/-;azot+/+, (A and D) azot-/-, and (A and E) azot+/+;azot+.

(F-K) Azot and cleaved caspase-3 expression upon UV irradiation (2 x 10^-2 J irradiation dose during second instar larvae, treatment as shown in F). (G)
Quantification of the percentage of Azot and cleaved caspase-3-expressing cells after UV irradiation. (H) Azot::dsRed expression after UV irradiation (red), (I)
cleaved caspase-3 (green) after UV irradiation, (J) merge, and (K) merge with DAPI (blue).

(L-P) Quantification of developmental aberrations and images of wings from 10- to 13-day-old flies after UV treatment (2 x 10^-2 J, pupae stage 0) of genotypes
(L and M) azot+/+, (L and N) azot-/-;azot+/+, (L and O) azot-/-, and (L and P) azot+/+;azot+.

(Q) Scheme showing the requirement of azot function for preventing developmental aberrations. Data are represented as mean ± SEM.
Figure 4. The azot Promoter Computes Relative Flower Lose and Sparc Levels

(A-F) Epistasis analysis of the following genotypes during dmyc-induced supercompetition. (A and B) UASRNAifweLose-A/B, (C and D) UASsparc, and (E and F)
UASRNAisparc. Azot::dsRed is shown in red (A, C, and E) and merges with GFP in (B, D, and F).

(G) Graph showing the probability of finding Azot expression in a GFP marked clone in several genotypes.

(H-K) Azot::dsRed expression after UV irradiation (red) is suppressed when UASRNAifweLose-A/B (H and I) or UASfweLose-A and UASfweLose-B (J and K) are
expressed ubiquitously. Quantified in Figure S4C.

(L and M) Epistasis analysis of Azot expression in the Drosophila retina. Pupal retinas dissected 44 hr APF of GMR-Gal4; RNAifwe (GD). Azot expression shown in
red (L) and merge with nuclear marker DAPI in blue (M). Quantified in Figure S4H.

(N) Azot is not expressed in cells without Flower Lose isoforms.

(O-Q) Cells expressing Flower Lose but that are either surrounded by cells with equal or higher levels of Flower Lose (O) or express high levels of Sparc (P) also do not
activate azot expression. Cells with higher relative levels of Lose and not enough Sparc induce the expression of azot and are eliminated (Q).
Figure 5. Expression of Flower Isoforms and Azot in Adult Flies with and without UV Irradiation

(A-E) Expression analysis of Azot (red, B and D) in the midgut without (B and C) and with (D and E) UV-irradiation treatment (as shown in A); (C) and (E) show
merges with DAPI.

(F-J) Expression analysis of Azot using reporter line azot{KO; gfp} in the adult brain without (G and H) and after (I and J) UV-irradiation treatment; merges with DAPI
in (H and J).

(K-T) Expression analysis of Flower Lose isoforms Lose A (green) and Lose B (red) (flower Lose-A-GFP, flower Lose-B-RFP). (K-N) In the midgut without (K
and L) and with (M and N) UV-irradiation treatment. (L and N) Merges with DAPI. Inset in (M) shows Fwe Lose-A and Fwe Lose-B expression at higher magnification.
(O-T) Expression of Flower Lose isoforms in the adult brain without (O-Q) and after (R-T) UV irradiation; merges with DAPI in (Q and T).



Azot function was essential for survival after irradiation, because
more than 99% of azot mutant adults died 6 days after
irradiation, whereas only 62.4% of WT flies died after the same
treatment (Figure S5I). The percentage of survival correlated with
the dose of azot because adults with three functional copies of
azot had higher median survival and maximum lifespan than
WT flies, or null mutant flies rescued with two functional azot
transgenes (Figure S5J).



These results show that in adult tissues external damage can
induce cell-fitness deficits.

Role of Cell Selection during Aging 

Lack of cell selection could affect the whole organism by two
nonexclusive mechanisms. The first is the failure to detect
precancerous cells, which could lead to cancer formation and death of the
individual. The second is the time-dependent accumulation of unfit
but viable cells, which could lead to accelerated tissue and organ
decay. We therefore tested both hypotheses.

It has been previously shown that cells with reduced levels of
cell polarity genes like scrib or dlg are eliminated but can give rise
to tumors when surviving (Igaki et al., 2009; Parisi et al., 2014;
Tamori et al., 2010). We therefore checked if azot functions as
a tumor-suppressing mechanism in those cells (Figures S6A-
S6M). Elimination of dlg and scrib mutant cells was not affected
by RNAi against azot (Figures S6D-S6M) or when Azot function
was impaired by mutation (Figures S6N-S6R), in agreement
with the absence of azot induction in these mutant cells (Figures
S1O and S6A-S6C). However, azot RNAi or the same azot
mutant background efficiently rescued the elimination of clones
with reduced Wg signaling (Vincent et al., 2011) (Figures S6J-
S6M, S6Q, and S6R).

Moreover, the high number of suboptimal cells produced by
UV treatment did not lead to tumoral growth in the azot-null
background (Figures S3E-S3G). Thus, tumor suppression
mechanisms are not impaired in azot mutant backgrounds, and tumors
are not more likely to arise in azot-null mutants.

Second, we tested whether the absence of azot accelerates
tissue fitness decay in adult tissues. We focused on the adult
brain, where neurodegenerative vacuoles develop over time
and can be used as a marker of aging (Liu et al., 2012). We
compared the number of vacuoles appearing in the brain of
flies lacking azot (azot-/-), WT flies (azot+/+), flies with one extra
genomic copy of the gene (azot+/+; azot+), and mutant flies
rescued with two genomic copies of azot (azot-/-;azot+/+).
For all the genotypes analyzed, we observed a progressive
increase in the number and size of vacuoles in the brain over time
(Figures 6A-6P; Figure S6S). Interestingly, azot-/- brains
showed a higher number of vacuoles compared to control flies
(azot+/+ and azot-/-;azot+/+) and a higher rate of vacuole
accumulation over time (Figures 6N-6P). In the case of
flies with three genomic copies of the gene (azot+/+; azot+),
vacuole number tended to be the lowest (Figures 6E, 6I, and
6M-6P).

Next, we analyzed the cumulative expression of azot during
aging of the adult brain. We detected positive cells as revealed
by reporter azot{KO; gfp}, in homozygosis, that express GFP
instead of Azot. We observed a time-dependent accumulation
of azot-positive cells (Figures 6Q-6W).

From this, we conclude that azot is required to prevent tissue
degeneration in the adult brain and that flies lacking azot showed signs of
accelerated aging. This suggested that azot could affect the
longevity of adult flies (Figures 6X and 6Y). We found that flies
lacking azot (azot-/-) had a shortened lifespan with a median
survival of 7.8 days, which represented a 52% decrease when
compared to WT flies (azot+/+), and a maximum lifespan of
18 days, 25% less than WT flies (azot+/+). This effect on lifespan
was azot dependent because it was completely rescued by
introducing two functional copies of azot (Figures 6X and 6Y). On the
contrary, flies with three functional copies of the gene (azot+/+;
azot+) showed an increase in median survival and maximum
lifespan of 54% and 17%, respectively.
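
The percentages above can be converted back into absolute values with simple arithmetic. The sketch below is illustrative only: it assumes that each percentage is expressed relative to the WT value measured in the same experiment at 29°C, and the WT and three-copy lifespans it derives are implied estimates, not values reported in the text. The function names are hypothetical.

```python
def baseline_from_decrease(observed: float, percent_decrease: float) -> float:
    """Reference value implied by an observed value and its percent decrease."""
    return observed / (1.0 - percent_decrease / 100.0)


def apply_increase(baseline: float, percent_increase: float) -> float:
    """Value obtained by raising a baseline by a given percentage."""
    return baseline * (1.0 + percent_increase / 100.0)


# Values stated in the text for azot-/- flies at 29°C.
mutant_median_days = 7.8   # reported as 52% lower than WT
mutant_max_days = 18.0     # reported as 25% lower than WT

# Implied WT values (assumption: percentages are relative to WT).
wt_median_days = baseline_from_decrease(mutant_median_days, 52.0)
wt_max_days = baseline_from_decrease(mutant_max_days, 25.0)

# Implied values for flies with three functional copies of azot
# (reported as +54% median survival and +17% maximum lifespan).
three_copy_median_days = apply_increase(wt_median_days, 54.0)
three_copy_max_days = apply_increase(wt_max_days, 17.0)

if __name__ == "__main__":
    print(f"implied WT median survival: {wt_median_days:.2f} days")
    print(f"implied WT maximum lifespan: {wt_max_days:.2f} days")
    print(f"implied three-copy median survival: {three_copy_median_days:.2f} days")
    print(f"implied three-copy maximum lifespan: {three_copy_max_days:.2f} days")
```

Under these assumptions, the implied WT median survival is about 16 days and the implied WT maximum lifespan is 24 days, so the three-copy genotype would reach a median survival of roughly 25 days.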

In conclusion, azot is necessary and sufficient to slow down 
aging, and active selection of viable cells is critical for a long life- 
span in multicellular animals. 

Death of Unfit Cells Is Sufficient and Required for 
Multicellular Fitness Maintenance 

Our results show the genetic mechanism through which cell
selection mediates elimination of suboptimal but viable cells.
However, using flip-out clones and MARCM (Lee and Luo, 2001), we
found that Azot overexpression was not sufficient to induce cell
death in wing imaginal discs (Figures S6T-S6Y). Because Hid is
downstream of Azot, we wondered whether expressing Hid
under the control of the azot regulatory regions could substitute for
Azot function.

In order to test this hypothesis, we replaced the whole
endogenous azot protein-coding sequence by the cDNA of
the proapoptotic gene hid (azot{KO; hid} flies; see Figure 7A).
In a second strategy, the whole endogenous azot protein-coding
sequence was replaced by the cDNA of transcription factor
Gal4, so that the azot promoter can activate any UAS-driven
transgene (azot{KO; Gal4} flies; Figure 7B). We then compared
the number of morphological aberrations in the adult wings of
six genotypes: first, homozygous azot{KO; Gal4} flies that
lacked Azot; second, azot{KO; hid} homozygous flies that
express Hid with the azot pattern in complete absence of Azot;
third, azot+/+ WT flies as a control; and finally three genotypes
where the azot{KO; Gal4} flies were crossed with UAShid,
UASsickle, another proapoptotic gene (Srinivasula et al., 2002), or
UASp35, an apoptosis inhibitor. In the case of UASsickle flies,
we introduced a second azot mutation to eliminate azot
function. Interestingly, the number of morphological aberrations
was brought back to WT levels in all the situations where the
azot promoter was driving proapoptotic genes (azot{KO; hid},
azot{KO; Gal4} x UAShid, azot{KO; Gal4} x UASsickle; see
Figures 7A-7J) with or without irradiation. On the contrary,
expressing p35 with the azot promoter was sufficient to
produce morphological aberrations despite the presence of one



Figure 6. azot Is Required to Prevent Tissue Degeneration in the Adult Brain and to Promote Lifespan 

(A-P) Brain integrity studies over time. (A) Axial plane of Drosophila WT brain counterstained with toluidine blue. (B-M) Magnification images of the central
brain, counterstained with toluidine blue, showing degenerative vacuoles (white dots) of the following four genotypes over time: (1) azot+/+, (2) azot-/-, (3) azot-/-;
azot+/+, and (4) azot+/+; azot+. (N-P) Number of neurodegenerative vacuoles. (N) Number of degenerative vacuoles per brain area (70 x 70 µm) after 1 day at 29°C
(azot+/+ n = 14, azot-/- [p < 0.01] n = 8, azot-/-;azot+/+ n = 16, and azot+/+; azot+ [p < 0.01] n = 11). (O) Number of degenerative vacuoles per brain area after 7 days
at 29°C (azot+/+ n = 16, azot-/- [p < 0.01] n = 16, azot-/-;azot+/+ n = 7, and azot+/+; azot+ [p < 0.01] n = 20). (P) Number of degenerative vacuoles per brain area after
14 days at 29°C (azot+/+ n = 7, azot-/- [p < 0.01] n = 3, azot-/-;azot+/+ n = 10, and azot+/+; azot+ n = 7).

(Q-V) Azot-positive cells (green, GFP) in azot{KO; gfp} homozygous flies after 1 day (Q and R), 7 days (S and T), and 14 days (U and V) at 29°C. DAPI is in blue.

(W) Number of Azot-positive cells per brain area (50 x 50 µm) in azot{KO; gfp} homozygous flies after 1 day (n = 11), 7 days (n = 15), and 14 days (n = 18) at 29°C.

(X) Lifespan studies of the same four genotypes at 29°C.

(Y) Lifespan values, including median survival and maximum lifespan, for the four genotypes.

Data are represented as mean ± SEM.
functional copy of azot (Figures S7A-S7H). Likewise,
p35-expressing flies (azot{KO; Gal4}/azot+; UASp35) did not survive
UV treatments (Figure S7I), whereas a percentage of the flies
expressing hid (26%) or sickle (28%) in azot-positive cells
were able to survive (Figure S7I).

From this, we conclude that specifically killing those cells 
selected by the azot promoter is sufficient and required to pre- 
vent morphological malformations and provide resistance to 
UV irradiation. 

Death of Unfit Cells Extends Lifespan 

Next, we checked if the shortened longevity observed in azot-/-
flies could also be rescued by killing azot-expressing cells
with hid in the absence of Azot protein. We found that azot
{KO; hid} homozygous flies had dramatically improved lifespan
with a median survival of 27 days at 29°C, which represented
a 125% increase when compared to azot-/- flies, and a
maximum lifespan of 34 days, 41% more than mutant flies
(Figures 7K and 7L).

Similar results were obtained at 25°C (Figures 7M and 7N).
We found that flies lacking azot (azot-/-) had a shortened
lifespan with a median survival of 25 days, which represented
a 24% decrease when compared to WT flies (azot+/+), and
a maximum lifespan of 40 days, 31% less than WT flies
(azot+/+). On the contrary, flies with three functional copies of
the gene (azot+/+; azot+) or flies where azot is replaced by hid
(azot{KO; hid} homozygous flies) showed an increase in median
survival of 54% and 63% and maximum lifespan of 12% and
24%, respectively.

Finally, we tested the effects of dietary restriction on longevity 
of those flies (Partridge et al., 2005) (Figures S7J and S7K). We 
found that dietary restriction could extend both the median sur- 
vival and the maximum lifespan of all genotypes (Figures S7J and 
S7K). Interestingly, dietary restricted flies with three copies of the
gene azot showed a further increase in maximum lifespan of 35%
(Figure S7K). This shows that dietary restriction and elimination 
of unfit cells can be combined to maximize lifespan. 

In conclusion, eliminating unfit cells is sufficient to increase 
longevity, showing that cell selection is critical for a long lifespan 
in Drosophila. 

DISCUSSION 

Here, we show that active elimination of unfit cells is required to
maintain tissue health during development and adulthood. We
identify a gene (azot), whose expression is confined to suboptimal
or misspecified but morphologically normal and viable cells.
When tissues become scattered with suboptimal cells, lack of 
azot increases morphological malformations and susceptibility 
to random mutations and accelerates age-dependent tissue 
degeneration. On the contrary, experimental stimulation of azot 
function is beneficial for tissue health and extends lifespan. 
Therefore, elimination of less fit cells fulfils the criteria for a hall- 
mark of aging (Lopez-Otin et al., 2013). 

Although cancer and aging can both be considered consequences
of cellular damage (Greaves and Maley, 2012; Lopez-Otin
et al., 2013), we did not find evidence for fitness-based
cell selection having a role as a tumor suppressor in Drosophila.
Our results rather support that accumulation of unfit cells affects
organ integrity and that, once organ function falls below a critical
threshold, the individual dies.

We find Azot expression in a wide range of “less fit” cells, such 
as WT cells challenged by the presence of “supercompetitors,” 
slow proliferating cells confronted with normal proliferating cells, 
cells with mutations in several signaling pathways (i.e., Wingless,
JAK/STAT, Dpp), or photoreceptor neurons forming incomplete
ommatidia. In order to be expressed specifically in “less fit”
cells, the transcriptional regulation of azot integrates fitness
information from at least three levels: (1) the cell’s own levels of
FlowerLose isoforms, (2) the levels of Sparc, and (3) the levels of
Lose isoforms in neighboring cells. Therefore, Azot ON/OFF
regulation acts as a cell-fitness checkpoint deciding which viable 
cells are eliminated. We propose that by implementing a cell- 
fitness checkpoint, multicellular communities became more 
robust and less sensitive to several mutations that create viable 
but potentially harmful cells. Moreover, azot is not involved in 
other types of apoptosis, suggesting a dedicated function, 
and— given the evolutionary conservation of Azot— pointing to 
the existence of central cell selection pathways in multicellular 
animals. 

EXPERIMENTAL PROCEDURES 
In Situ Hybridization 

We followed the protocol described in Rhiner et al. (2010). Probe sequences 
are available upon request. 

Drosophila Genetics 

Stocks and crosses were kept at 25°C in standard media. The following stocks 
were used: ywf;tub > dmyc > Gal4/Cyo;UASgfp; azot::dsRed/TM6B; GMR-Gal4;
azot::dsRed/TM6B; ywf;tub > dmyc > Gal4,azot−/Cyo;UASgfp; ywf;tub



Figure 7. Culling Azot-Expressing Cells Is Sufficient and Required for Multicellular Fitness Maintenance

(A and B) Knockin (KI) schemes (A) azot{KO; Gal4} and (B) azot{KO; hid}.

(C-F) Wings from 10- to 13-day-old flies and quantification of developmental aberrations of the following five genotypes: (C) azot+/+, (C and D) azot{KO; Gal4}/azot{KO; Gal4}, (C and E) azot{KO; hid}/azot{KO; hid}, (C and F) azot{KO; Gal4}/azot−; UASsickle, and (C) azot{KO; Gal4}/azot−; UAShid.

(G-J) Wings from 10- to 13-day-old flies and quantification of developmental aberrations after UV irradiation of the same five genotypes. Irradiation dose of 2 × 10^-2 J administered during pupal stage 0.

(K and L) Comparative lifespan studies of genotypes azot{KO; hid}/azot{KO; hid} and azot−/− at 29°C.

(L) Median and maximum survival of genotypes azot{KO; hid}/azot{KO; hid} and azot−/−.

(M and N) Lifespan studies at 25°C of the following four genotypes: (1) azot+/+, (2) azot−/−, (3) azot+/+; azot+, and (4) azot{KO; hid}/azot{KO; hid}. (N) Median and
maximum survival of the four genotypes.

(O) Scheme showing that specifically killing Azot-expressing cells with the general proapoptotic factor Hid is sufficient to prevent morphological malformations 
and rescue azot mutant phenotypes. 

Data are represented as mean ± SEM. 
> dmyc > Gal4,azot−/Cyo;UASrfp; ywf;act > y+ > Gal4,azot−/Cyo;UASgfp;
ywf;act > y+ > Gal4/Cyo;UASRNAiazot; azot{KO;gfp}; azot{KO;hid}; azot{KO;
Gal4}; UASbrk;act > cd2 > Gal4,UASgfp/TM6B; act > y+ > Gal4,UASgfp;azot::
dsRed/TM6B; w;flowerUbi-YFP,flowerLose-A-GFP,flowerLose-B-RFP; ywf;
Ubigfp,MinuteFRT42/Cyo; ywf;FRT42/Cyo; hsFlp,UAS-CD8-GFP;GAL80 FRT
40A/Cyo;tub > G4/TM6B; ywFlp;armZFRT40A/Cyo;MKRS/TM6B; ywf;patched- 
Gal4; apterous-Gal4; GMR-Gal4,UASeiger; RNAifwe‘"°^'^^ (Merino et al., 
2013); ywf;UASmfwe^; ywf;UASsparc/TM6B; UASfwe^°^'®: UASfwe^°^^*; 
UASfwe^°^® ^UASfwe^°^®"^; UASp35; UASpuckered; UASdAxin/TM3; UAShid; 
UASsickle; UASbax; UASbcl2; UASp53DN; UASRNAifweGD;
UASRNAisparc(16678); UASRNAiazotGD(18166); UASRNAiazotKK(102353);
UASRNAiscribble(Bloomington); UASRNAidlg(Bloomington);
UASRNAihopscotch(Bloomington); UASRNAieigerGD; ywf;Cyo/if;UASazot/TM6B;
ywf;Cyo/if;UASazot-HA/TM6B; ywf;Cyo/if;UASazotpm4Q12/TM6B; ywf; UASlacZ; and UASCSK-IR.

Clone Induction 

Flip-out clones were generated after heat shock at 37°C between 5 and 
15 min. For ubiquitous expression experiments larvae were subjected to 
45 min heat shock for all cells to perform flip-out and activate Gal4 under 
the control of the actin promoter (act>Gal4).

Azot Reporter: azot::dsRed 

The genomic region 3 kb upstream plus the full exon was cloned in the
pRedStinger vector using XbaI and KpnI restriction sites. Primer sequences are
available upon request.

Overexpressing Constructs 

cDNA of azot was fully sequenced and subcloned into the pUASattB vector
using XbaI and KpnI restriction sites. In order to generate N- and C-terminal
HA-tagged forms, the respective cDNAs were amplified with primers containing
the HA sequence and subcloned into KpnI and XbaI sites of pUASattB.
Primer sequences are available upon request.

Azotpm4Q12 

Site-directed mutagenesis was used to create point mutations that changed 
glutamic acid (E) to glutamine (Q) as shown in Figure S1A. Primer sequences 
are available upon request. 

Azot Knockout Generation 

We followed the genomic engineering strategy described in Huang et al.
(2009); homologous regions are shown in Figure 1A. Primer sequences are
available upon request.

Knockin Generation 

The knockout founder line (Figure 2A) was used for the generation of knockin flies
as described in Huang et al. (2009). cDNA of gfp, hid, and Gal4 was used for the
generation of azot{KO; gfp}, azot{KO; hid}, and azot{KO; Gal4} knockin lines.
Primer sequences are available upon request.

Immunohistochemistry 

Standard immunohistochemistry protocol was used for antibody detection
(Rhiner et al., 2010). For the generation of specific antibodies against Azot,
N-terminal peptide MEDISHEERVLILDTFR was used to immunize rabbits.
Anti-Wingless (ms, 1:50) was from DSHB, anti-caspase-3 (rabbit, 1:100)
was from Cell Signaling Technology, anti-KDEL (rabbit, 1:100) was from Abcam,
anti-cytochrome c (mouse, 1:800) was from BD Pharmingen, anti-Hid
(rabbit, 1:50) and anti-HA (rat, 1:250) were from Roche, and anti-βGal
(mouse, 1:200) was from Promega. TUNEL staining was performed as described
(Lolo et al., 2012). Confocal images were acquired with Leica SP2 and SP5
microscopes.

UV Treatments 

Treatments were performed using a UV Stratalinker 2400 machine (UV-B
254 nm). Adult flies were subjected to 2 × 10^-2 J dose of UV irradiation
when they were 1-3 days old and analyzed for Azot and Flower isoform
expression 24 hr later. For lifespan experiments after irradiation, a dose of
5 × 10^-2 J was used. Larvae and pupae were subjected to 2 × 10^-2 J dose of
UV irradiation, and Azot expression or developmental aberrations were
analyzed.

Longevity Assays 

Cohorts of 100 female flies (1-3 days old) of the same genetic background 
were collected and kept at 29°C or 25°C on standard food (3.4 l water,
280 g maize, 36 g agar, 120 g yeast, 300 g sugar syrup, 32 g potassium, 6 g 
methyl, 20 ml propionic acid). Surviving flies were counted every 2 days (He 
and Jasper, 2014). 
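
A census taken every 2 days yields a survival curve from which median survival and maximum lifespan can be read off. The sketch below shows one simple way to do this; it is an illustration under stated assumptions, not the authors' procedure. The function names, the definition of median survival as the first census day with half or fewer of the cohort alive, and the example counts are all hypothetical.

```python
def median_survival(census_days, alive_counts):
    """First census day at which half or fewer of the starting cohort is alive."""
    start = alive_counts[0]
    for day, alive in zip(census_days, alive_counts):
        if alive <= start / 2:
            return day
    return None  # cohort never dropped to 50% during the census period


def maximum_lifespan(census_days, alive_counts):
    """First census day at which no flies remain alive."""
    for day, alive in zip(census_days, alive_counts):
        if alive == 0:
            return day
    return None  # some flies were still alive at the last census


# Hypothetical cohort of 100 flies counted every 2 days (made-up numbers).
days = list(range(0, 22, 2))
alive = [100, 100, 98, 95, 88, 74, 55, 41, 20, 6, 0]

print("median survival (days):", median_survival(days, alive))
print("maximum lifespan (days):", maximum_lifespan(days, alive))
```

Because counts are taken only every 2 days, both values are resolved to the nearest census day.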

Dietary Restriction Assays 

Cohorts of 100 female flies (1-3 days old) were collected and kept at 29°C on 
water-diluted standard food (one to one). Surviving flies were counted every 
2 days. 

Brain Studies 
Brain Integrity 

Adult flies of the selected genotypes, kept at 29°C, were analyzed at the
selected time points for the appearance of neurodegenerative vacuoles in the
central brain, as previously described (Kretzschmar et al., 1997).

Azot Expression 

Adult flies azot{KO; gfp}/azot{KO; gfp} were kept at 29°C. The selected time 
points were analyzed for the number of GFP-positive cells in the central brain. 

Statistical Analysis 

For the rescue assay using azot KO in supercompetition (Figure 2E), rescue 
assay in supercompetition with azot RNAi and overexpression of the protein 
(Figures S2J-S2P), the rescue assay of clones with apicobasal defects and 
the clones with deficient Wg signaling (Figures S6N-S6R), and brain integrity 
studies over time (Figures 6A-6P), the data were analyzed with the K indepen- 
dent samples test. The post hoc DMS test was then used to detect significant 
differences. 

For the caspase-positive cells in azot+/+ and azot−/− background (Figure 2D),
the rescue assay in overexpression of FlowerLose isoforms (Figures 2R-2T;
Figure S2T), and azot overexpression in clones (Figures S6T-S6Y), all data were
analyzed with two independent samples test (Mann-Whitney U test). Levene
test was used to analyze number of cleaved caspase-3-positive cells, rescue
assay of FlowerLose isoforms, and number of azot-overexpressing clones.

For the quantification of the number of developmental aberrations before
and after irradiation treatment in azot+/+, azot+/−, azot−/−, and azot−/−;
azot+/+ background (Figures 3A-3E, 3L-3P, 7C-7J, and S7A-S7H), data
were analyzed with the K independent samples test (Levene), and Levy-Tukey
was used for post hoc analyses.

In the rescue assay in supercompetition using RNAi (24 hr ACI) (Figures 
S2C-S2I), the data were analyzed with ANOVA test. 

In the quantification of eye size in apoptosis assay (Figures S3H-S3N), the 
data were analyzed with ANOVA. Bonferroni post hoc test was used to detect 
significant differences among genotypes. 

For the functional assays of azot in retinas (Figures 2G-2L), azot dose sen- 
sitive (Figures 2U-2W), rescue assay in overexpression of mouse flower^ iso- 
form (Figure S2U), and rescue assay of clones with apicobasal defects, and 
clones with deficient Wg signaling by azot RNAi (Figures S6D-S6M), all data 
were analyzed with Student’s t test. 

For the lifespan analysis (Figures 6X, 7K, 7M, and S7J), the log-rank test was 
used to study significant differences among the genotypes. 
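
For readers unfamiliar with the log-rank test, the following is a minimal two-group sketch in plain Python. It assumes every fly's death is observed (no censoring) and uses made-up death days; the function name and data are hypothetical, and a statistics package would normally be used for a real analysis.

```python
import math


def logrank_test(times_a, times_b):
    """Two-group log-rank test for fully observed death times (no censoring).

    Returns the chi-square statistic (1 degree of freedom) and its p value.
    """
    event_times = sorted(set(times_a) | set(times_b))
    observed_a = 0.0
    expected_a = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for x in times_a if x >= t)
        at_risk_b = sum(1 for x in times_b if x >= t)
        deaths_a = sum(1 for x in times_a if x == t)
        deaths_b = sum(1 for x in times_b if x == t)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group A at time t.
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail probability, 1 df
    return chi2, p_value


# Hypothetical death days for a short-lived and a long-lived genotype.
short_lived = [10, 12, 12, 14, 16, 18]
long_lived = [20, 22, 24, 24, 26, 28]
chi2, p = logrank_test(short_lived, long_lived)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

The test compares, at each death time, the number of deaths observed in one group with the number expected if both groups shared the same survival curve.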

SUPPLEMENTAL INFORMATION 

Supplemental Information includes seven figures and can be found with this 
article online at http://dx.doi.org/10.1016/j.cell.2014.12.017.
SUMMARY

MYC is a highly pleiotropic transcription factor whose 
deregulation promotes cancer. In contrast, we find 
that Myc haploinsufficient (Myc+/−) mice exhibit
increased lifespan. They show resistance to several 
age-associated pathologies, including osteoporosis, 
cardiac fibrosis, and immunosenescence. They also 
appear to be more active, with a higher metabolic 
rate and healthier lipid metabolism. Transcriptomic 
analysis reveals a gene expression signature enriched 
for metabolic and immune processes. The ancestral 
role of MYC as a regulator of ribosome biogenesis 
is reflected in reduced protein translation, which is 
inversely correlated with longevity. We also observe 
changes in nutrient and energy sensing pathways, 
including reduced serum IGF-1, increased AMPK activity,
and decreased AKT, TOR, and S6K activities.
In contrast to observations in other longevity models,
Myc+/− mice do not show improvements in stress
management pathways. Our findings indicate that 
MYC activity has a significant impact on longevity 
and multiple aspects of mammalian healthspan. 



INTRODUCTION 

Myc is a helix-loop-helix leucine zipper transcription factor that is 
highly conserved among metazoans (Meyer and Penn, 2008). It 
was discovered as the transforming oncogene of the MC29 avian 
myelocytomatosis virus and subsequently as the cellular proto- 
oncogene activated in Burkitt’s lymphoma. Increased expres- 
sion of the MYC protein strongly promotes cell proliferation 
and has been documented as a frequent event in a wide variety 
of human cancers (Dang, 2012). 

By interacting with partners such as MAX and ZBTB17 (MIZ1),
MYC can either activate or repress transcription (Meyer and 
Penn, 2008). Much effort has been focused on understanding 
how MYC influences signaling networks and it has emerged as 
a major regulatory hub. In addition to its role in cancer, it is 
also critically involved with many essential cellular processes, 
and the mouse knockout is embryonic lethal. By conservative 
estimates, 15%-20% of all genes are directly regulated 
by MYC, including genes that play key roles in metabolism, ribo- 
some biogenesis, cell cycle, apoptosis, differentiation, and stem 
cell maintenance (Dang, 2012). 

While age does not have a significant effect on Myc expression 
in any mouse tissue examined (Zahn et al., 2007), many of the 
biological processes regulated by MYC have also been impli- 
cated in aging and age-associated diseases. MYC upregulates 
major biosynthetic pathways leading to cellular growth and pro- 
liferation and enhances energy production through glycolysis 
and oxidative phosphorylation (Dang, 2012). In contrast, calorie 
restriction (CR) and reduction of insulin/IGF-1 signaling promote 
longevity (Gems and Partridge, 2013). MYC also increases 
protein synthesis by positively regulating ribosome biogenesis 
(Brown et al., 2008), while reducing translation can extend life- 
span (Johnson et al., 2013). 

MYC overexpression results in an increase in reactive oxygen 
species (ROS) and DNA damage (Vafa et al., 2002), which are 
believed to contribute to the progression of aging (Hoeijmakers, 
2009). Stem cell populations decline in number and functionality 
with normal aging (Cho et al., 2008; Jang et al., 2011), and 
ectopic MYC expression depletes stem cell populations (Eilers 
and Eisenman, 2008). MYC may also affect the inflammatory 
state that accompanies aging, because it directly regulates 
expression of some cytokines (Whitfield and Soucek, 2012) 
and may influence the composition of the leukocyte population 
via its roles in proliferation and stem cell maintenance (Eilers 
and Eisenman, 2008; Wang et al., 2011a). 

The overall trend suggested by this evidence is that increased 
MYC activity promotes several processes that have been 
Figure 1. MYC Expression, Longevity, Body Mass, and Composition

(A) MYC mRNA and protein expression. mRNA was 
measured by qRT-PCR. Protein was measured by 
immunoblotting of tail fibroblasts or by IF of liver 
sections. Fibroblasts: n = 3, 3 months, females. 
Liver: n = 3-5, 5-9 months, both sexes. 

(B) Survival of Myc+/+ (blue) and Myc+/− (red) mice.
Each data point shows one animal (N, number of
animals in each cohort).

(C) Lifelong trends of whole body weights. Median
weights (3 week sliding window) of the aging cohorts
in (B). Inset: weights at 500 days and relative
weights of Myc+/− animals.

(D) Relative adiposity. Animals were scanned by
micro-CT imaging. Left: volume of all white fat as
percent of total volume of the animal. Right: volume
of visceral fat as percent of total fat. n = 6,
5 months.

Error bars represent SEM.

See also Figure S1 and Tables S1 and S2.
connected with aging and age-associated diseases. To address
the role of Myc in aging, given that a complete loss of Myc
is embryonic lethal while overexpression promotes cancer,
we established a partial loss of function (hypomorphic) model
in the mouse. We previously found that cells knocked out for
one copy of the Myc gene display a variety of mild but distinct
phenotypes, including reduced rates of proliferation (Mateyak
et al., 1997). Similarly, heterozygous (Myc+/−) mice appear
normal and healthy, but have a 20% smaller body mass
(de Alboran et al., 2001; Trumpp et al., 2001). We used this
simple constitutive hypomorphic model to address the consequences
of reduced MYC activity on aging.

RESULTS 

Experimental Strategy 

Mice with all coding exons of one copy of the Myc gene flanked
by LoxP sites (de Alboran et al., 2001) were bred to mice expressing
germline Cre recombinase, converting the floxed allele
to a deletion, and subsequently backcrossed to C57BL/6 for
ten generations. The expected decreases of MYC mRNA and
protein levels in Myc+/− mice were confirmed in tail fibroblasts
and several tissues (Figure 1A and Figures S1A-S1C available
online). MYC chromatin immunoprecipitation of liver extracts
from Myc+/− animals reproducibly gave lower yields, indicating
that in vivo a smaller fraction of DNA was bound by MYC
(Figure S1D).

Myc+/− Mice Have Increased Longevity

Large cohorts of both sexes and genotypes were maintained in a
barrier facility and allowed to die of natural causes. A highly significant
increase in median lifespan was observed: 10.7% for
males, 20.9% for females, and 15.1% for both sexes combined
(Figure 1B; Tables S1 and S2). We do not know the
reason for the greater effect in females;
we note, however, that control (Myc+/+) females were shorter
lived than males (Figure S1E), an observation that has been
reported in several colonies of C57BL/6 mice (Ladiges et al.,
2009). Thus, while Myc+/+ males lived 8.8% longer than Myc+/+
females, Myc+/− males and females had equivalent lifespans.
Maximum lifespans were commensurately increased, with nearly
all of the mice surviving to the longest-lived decile being of
Myc+/− genotype. The instantaneous mortality rate was lower
for Myc+/− mice across all ages (Figure S1F), indicating that the
health benefits are not limited to a particular age.
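
The statement that heterozygous males and females end up with equivalent lifespans can be cross-checked against the reported percentages. The sketch below is illustrative only: it normalizes the control female median lifespan to 1.0 (an assumption, since absolute values are not given in this passage) and assumes that the 8.8% difference between control males and females refers to median lifespan.

```python
# Control (Myc+/+) female median lifespan, normalized to 1.0 (arbitrary unit).
control_female = 1.0
# Control males lived 8.8% longer than control females.
control_male = control_female * 1.088

# Reported increases in median lifespan for the heterozygous genotype.
het_male = control_male * 1.107      # 10.7% increase in males
het_female = control_female * 1.209  # 20.9% increase in females

ratio = het_male / het_female
print(f"heterozygous male / female median lifespan ratio: {ratio:.3f}")
```

The two products differ by less than 1%, which is consistent with the larger relative gain in females closing the gap seen in controls.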

Myc+/− Mice Are Smaller and Develop and Reproduce
Normally

Myc+/− mice are healthy, robust, and can be group-housed
with their Myc+/+ littermates without any adverse consequences.
Animals of both sexes are 15%-20% smaller as adults
(Figure 1C). These differences were apparent at weaning and
the weight curves of Myc+/+ and Myc+/− mice were parallel over
time. Mass has been noted as a predictor of longevity in
mice, although this effect is strain-specific (Anisimov et al.,
2004). We found no correlation between longevity and the weight
attained by individual mice within any cohort (Figure S1G).

The decreased size of Myc+/− mice is proportional across all
parts of the body, with the mass ratio of major organs to total
body weight not significantly different between genotypes
(Figure S1H). Prior examination of Myc hypomorphic mice did not
find significant changes in cell size among several organs, which
we confirmed in our animals (Figure S1I). Of particular interest
was adipose tissue, because in several long-lived mouse models
reduced adiposity has been associated with longevity. Myc+/−
and Myc+/+ mice have a similar proportion of adipose tissue relative
to total body volume and also have similar proportions of
visceral and subcutaneous adipose depots (Figure 1D). Consistent
with this, Myc+/− and Myc+/+ mice had similar levels of both
leptin and adiponectin in serum at both 6 and 22 months,
implying similar adipose tissue function (Figures 2A and 2B).
The reduced size of Myc+/− mice may be due to their lower levels
of serum IGF-1, which we documented in both young and old
animals (Figure 2C).

The concept of a tradeoff between longevity and reproduction,
namely that increased longevity may bear the cost
of decreased efficacy or duration of reproductive ability, has
been widely discussed. We tested both sexes for reproductive
fitness and longevity and found no significant differences between
the two genotypes (Figures 2D and S2A-S2C). Myc+/−
females displayed slightly accelerated vaginal patency
(Figure 2E), indicating that sexual maturation is not retarded. The
increased longevity of Myc+/− mice is thus not associated with
either slower development or reduced fecundity. We also performed
skin wound-healing assays and did not find statistically
significant differences between the genotypes, even in old
animals (Figure S2D).

The Major Cause of Death Is Lymphoma in Both 
Genotypes 

We performed autopsies on all adequately preserved mice at
time of death and a subset of animals were subjected to a histopathological
analysis of 18 tissues. Seventy-nine percent of
Myc+/+ and 81% of Myc+/− mice were deemed to have died of
cancers (of which 80% and 87%, respectively, were lymphomas;
Figure 2F). The spectrum of pathologies was typical of the
C57BL/6 strain of mice and very similar in both genotypes (Table
S3). Hence, the increased longevity of Myc+/− mice is not due to
fewer cancer-caused deaths.



Figure 2. Metabolic Hormones, Fecundity, 
and Cancer Incidence 

(A-C) Levels of adiponectin, leptin, and free IGF-1
in plasma. Blood samples were collected after
overnight fasting. n = 5-12, 5 and 22 months,
females.

(D) Lifetime reproductive output. Myc+/− and
Myc+/+ females (n = 5) were bred with Myc+/+
males (left), and Myc+/− and Myc+/+ males (n = 7-9)
were bred with Myc+/+ females (right).

(E) Sexual maturation of females. n = 12-19.

(F) Left: cause of death determined by a veterinary
pathologist. n = 19 Myc+/+, 38 Myc+/−, both sexes.
Right: incidence of macroscopic cancer noted at
time of autopsy (Myc+/+, 73.4%; Myc+/−, 53.7%;
p = 0.03). n = 64 Myc+/+, 67 Myc+/−, both sexes.

(G) Histopathological analysis was used to determine
the spread (left; number of affected tissues)
and severity (right; maximum grade) of lymphoma.
n = 12 Myc+/+, 27 Myc+/−, both sexes.

Error bars represent SEM. 

See also Figure S2 and Table S3. 



We, however, also noted evidence for decreased progression of cancer
in Myc+/− animals. First, careful macroscopic examination at time of
autopsy revealed fewer tumors visible to the naked eye (Figure 2F). Second,
histopathological analysis showed that by the time of death lymphoma
spread to significantly fewer organs (Figure 2G). Third,
while the severity (maximum grade) of lymphoma at time of death
was similar in both genotypes, Myc+/− mice survived considerably
longer. Others have reported that reducing cancer prevalence
alone does not substantially extend lifespan (Matheu
et al., 2004). Additional data from multiple organ systems presented
below indicate that other effects of the Myc+/− genotype
are likely to contribute to the extended lifespan.
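
The macroscopic cancer incidence given in the Figure 2 legend (73.4% of 64 control mice versus 53.7% of 67 heterozygous mice, p = 0.03) can be explored with a 2 × 2 contingency test. The sketch below is a rough illustration: the animal counts are reconstructed by rounding the percentages, and the choice of a chi-square test with Yates continuity correction is an assumption, since the test used for this comparison is not stated in this passage. The function name is hypothetical.

```python
import math


def yates_chi2_2x2(a, b, c, d):
    """Chi-square test with Yates continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2.0, 0.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail probability, 1 df
    return chi2, p_value


# Counts reconstructed from the reported percentages (assumption).
wt_total, het_total = 64, 67
wt_cancer = round(0.734 * wt_total)
het_cancer = round(0.537 * het_total)

chi2, p = yates_chi2_2x2(
    wt_cancer, wt_total - wt_cancer,
    het_cancer, het_total - het_cancer,
)
print(f"counts: {wt_cancer}/{wt_total} vs {het_cancer}/{het_total}")
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the p value falls in the same range as the reported one; a different test (for example, Fisher's exact test) would give a somewhat different number.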



Gene Expression Analysis Points to Changes in 
Metabolism, the Immune System, and a Unique 
Expression Signature 

Microarray expression profiling was performed on liver, skeletal
muscle, and white adipose (gonadal) tissues in 5- and 24-month-old
animals. A principal component analysis showed that the first
component was tissue, the second age, and the third genotype,
representing 61.6%, 30.0%, and 2.9% of the total variability,
respectively (Figure 3A; Table S4). The relatively small effect of
the Myc+/− genotype is notable, accounting for 10-fold fewer
changes than those due to age. For all three tissues, both the
number of differentially expressed genes and magnitude of
average change with age were lower in Myc+/− animals
(Figure 3B). This reduction in the apparent aging of the transcriptome
suggests that Myc+/− mice are long-lived because of widespread
changes that affect multiple tissues.

Differentially expressed genes were enriched in pathways 
related to the immune system and metabolism, especially that 
of lipids (Figure S3A). We also evaluated which upstream regulators
could explain this pattern of transcription (Table S5). When
sorted by significance for the effect of genotype in old animals,



Cell 160, 477-488, January 29, 2015 ©2015 Elsevier Inc. 479





Figure 3. Transcriptome Analysis 

(A) Gene expression changes. Three parameters were compared: genotype (Myc+/+, Myc+/−), age (5, 24 months), and tissue (liver, muscle, adipose). Genotype,
age, and tissue comparisons are connected with diagonal, vertical, and horizontal lines, respectively. The number of genes changing expression is shown above
and below each line. Red numbers: genes upregulated in Myc+/+ versus Myc+/− animals (MYC-activated genes); blue numbers: genes upregulated in Myc+/−
versus Myc+/+ animals (MYC-repressed genes). A 1.5-fold cutoff and a FDR threshold of <5% were used. n = 5-8 male animals.

(B) Effect of age on the transcriptome. Left: number of genes whose expression changes with age by more than 50% in either direction; hatched area represents
genes in common between the two genotypes. Right: the median change in expression with age (expressed as %) across all expressed genes.

(C) Meta-analysis. Differentially expressed genes in Myc+/− versus Myc+/+ animals were compared with genes similarly recovered in studies of calorie restriction,
metformin, or resveratrol treatment (Martin-Montalvo et al., 2013; Pearson et al., 2008). Upward arrows indicate genes upregulated in the long-lived condition
(and conversely for downward arrows). Expression in liver of old male mice was compared. Skeletal muscle showed the same trends.

See also Figure S3 and Tables S4, S5, and S6.



regulators of lipid metabolism and the immune system were 
prominently enriched. Transcription factors that promote lipid 
biosynthesis and adipogenic fate were predicted to become 
more active with age in tissues in which adipogenesis and lipid 
accumulation are considered pathogenic, such as the liver and 
skeletal muscle, and less active with age in adipose tissue. 
These changes were reversed in old Myc+/− tissues. The effects
of numerous immune system regulators (including interferons,
interleukins, colony stimulating factors and NF-kB) were strongly
predicted to increase with age in all three tissues, and this effect
was predicted to be counteracted in muscle of Myc+/− mice.

Expression of xenobiotic metabolism enzyme (XME) genes increases
with age and is further elevated in male mice that are
long-lived due to genetic, pharmacologic, or dietary interventions
(female mice express these genes at an elevated and
mostly constant level) (Li et al., 2013). The XME gene expression
signature in male Myc+/− mice did not resemble the signatures
observed in several mouse longevity models (Table S6). This
analysis is consistent with the results of stress tests performed
in cell culture with tail fibroblasts established from adult Myc+/−
mice (Figure S3D), which did not display increased resistance
to any of the stressors previously shown to be counteracted by
cells from a variety of long-lived mice. We also note that
compared to other longevity models, some of which display
very pronounced changes in XME gene expression, the changes
in our animals were very modest (Table S6).

In a broader meta-analysis, we compared our gene expression 
data sets with those of other lifespan and healthspan extending 
interventions: CR and treatments with resveratrol or metformin. 
While some of these interventions show considerable overlap 
with one another in differentially expressed genes, especially 
CR and metformin (Martin-Montalvo et al., 2013), none were 
similar to the Myc+/− signature (Figures 3C and S3B). A meta-analysis
based on Gene Ontology (GO) pathways also failed to
indicate substantial overlap (Figure S3C). As before, CR and
metformin showed much greater similarity to each other than
either with Myc+/−. The few universally enriched GO terms,
however, further implicated metabolism and inflammation; for
example, in muscle the only commonly enriched GO term was
immune response, while in liver high density lipoprotein binding
and hormone activity were two of only five enriched GO terms.

Myc+/− Mice Show Evidence of Improved Healthspan in
Multiple Tissues

The pathological effects of aging in several organ systems were
found to be attenuated in Myc+/− mice. Cardiac fibrosis increases
with age and in diabetes and obesity and has been found to be
reduced in several long-lived mouse models (Dai et al., 2009).
Myc+/− mice had less cardiac fibrosis in old age than Myc+/+
animals (Figure 4A). An important component of healthspan in
females is osteoporosis (Syed and Melim, 2011). In Myc+/+
females, both bone volume and trabecular number declined
with age, whereas trabecular spacing increased (Figures 4B
and 4C). Using all these parameters, old Myc+/− mice were indistinguishable
from young Myc+/+ animals, indicating that Myc+/−
females do not develop osteoporosis by 22 months of age.
Musculoskeletal and neurological performance also decline
with age, and the rotarod test is a frequently used measure of
motor coordination that reflects these parameters. Old Myc+/−
mice were able to remain on the rotarod nearly twice as long
as Myc+/+ animals of the same age (Figure 4G), indicating an
attenuation of the effects of aging on motor function.

Lipid metabolism is strongly affected by aging. Hepatic lipid
droplets (LDs) are small membrane-bound organelles that store
fatty acids and cholesterol and regulate triglyceride metabolism
(Fujimoto and Parton, 2011). LD activities occur mainly at their
surface and hence their size rather than aggregate volume is
a relevant measure of functional capacity. We found that while
total hepatic LD area decreased ~2-fold in Myc+/+ animals with
age (Figure S4A), average LD size increased by over 3.5-fold
(Figure 4D). In contrast, total LD content in Myc+/− mice did not
change, and LD size increased to a lesser extent. The relative
preservation of LD surface area in Myc+/− animals is also
consistent with increased expression of genes involved in their
biogenesis (Plin2, Cidec, Fitm1).

Another important healthspan parameter of lipid metabolism is 
cholesterol. Cholesterol levels in mouse liver increase with age, 
and interventions that promote longevity, such as CR, result in 
decreased cholesterol synthesis (Tsuchiya et al., 2004). In our 
microarray data sets, the expression of cholesterol biosynthetic 
genes increased with age in Myc+/+ mice. In contrast, almost all
genes in this pathway showed lower expression in Myc+/− animals,
which we confirmed by qRT-PCR (Figure 4E; this included
the rate limiting enzyme, Hmgcr, and the master transcriptional
regulator, Srebf2). Furthermore, both esterified and nonesterified
cholesterol was reduced in serum of Myc+/− mice, as well as in
liver, where it is synthesized and stored (Figure 4F).

Immunosenescence results in part from thymic involution as
well as replicative senescence of some T cell populations (Akbar
and Henson, 2011). Because few T cells are produced by the
thymus in older animals, the T cell pool is maintained by the
expansion of existing memory T cells, which is greater for cytotoxic
CD8+ than regulatory CD4+ T cells (Goronzy and Weyand,
2005). Naive T cells do not proliferate. As previously reported,
the ratios of both CD4+ to CD8+ T cells and of naive to memory
T cells declined with age in normal (Myc+/+) mice, but both were
significantly elevated in Myc+/− animals at 24 months of age
(Figures 5B and 5C). We found no effect of genotype on total T cell
levels (Figure 5A), the presence of macrophages in several tissues
(Figure S4B), or total white blood cell counts (Figure S4C).
These results indicate that Myc+/− animals can maintain younger
and presumably healthier proportions of T cell populations.

We next examined age-related changes in the bone marrow 
and thymus, where T cells develop and mature. In C57BL/6
mice the thymus begins to involute at 10 months of age and
very little remains by 20 months. Thymic mass was slightly
decreased in Myc+/- mice although the rate of involution was
not changed; adiposity and the proportion of senescent cells
were also not affected by genotype (Figures S4D and S4E).
Hence, the younger T cell profiles in Myc+/- animals are unlikely
to be caused by reduced thymic involution.

Hematopoietic stem cells (HSC) differentiate through several
subpopulations: long-term HSC (LT-HSC) differentiate into
short-term HSC (ST-HSC), which can then differentiate into either
common myeloid progenitors (CMP) or common lymphoid progenitors
(CLP). With normal aging, the CLP/CMP ratio decreases
by ~3-fold (Cho et al., 2008). Flow cytometry of HSC showed
that 16-month-old Myc+/- mice had a higher CLP/CMP ratio
relative to Myc+/+ animals (Figures 5D and 5E). This is further
evidence for decelerated aging of the hematopoietic system
and might be responsible for the observed changes in T cell proportions.
Consistent with this, we found that Myc+/- mice have
a lower ratio of ST-HSC to LT-HSC (Figure 5F), suggesting that
LT-HSC can persist in greater numbers and/or functional
capacity into old age.

Myc+/- Mice Do Not Show Changes in Pathways Linked
to Macromolecular Damage

A commonly invoked cause of aging is the accumulation of 
damage to macromolecules, particularly from oxidative stress. 
Genotoxic stress, manifested as various forms of DNA damage, 
increases dramatically with age (Hoeijmakers, 2009). We examined
the frequency of DNA damage foci, and while we saw clear
evidence of the age-associated increase, the Myc genotype had
no effect (Figure 5G). Genotoxic stress and other forms of damage
can lead to apoptosis, which has been found to rise with age
in several tissues (Kujoth et al., 2005). While we observed the
increase with age, the changes were equivalent in Myc+/+ and
Myc+/- mice (Figure 5H). Genotoxic stress is also a major trigger
of cellular senescence. As with apoptosis, however, Myc genotype
did not affect the age-associated increase in cellular senescence
(Figure S4E).

The cyclin-dependent kinase inhibitors p21 (Cdkn1a) and p16
(Cdkn2a) are important regulators of cellular senescence. As expected,
both p21 and p16 were upregulated with aging (Figures
S4F and S4G). The expression of p21, which responds strongly
to DNA damage and ROS, was unaffected by Myc genotype in
five tissues. The response of p16, whose regulation is not well

Figure 4. Amelioration of Age-Associated Phenotypes

(A) Cardiac fibrosis was scored in ventricular cross sections using Masson's trichrome stain. n = 11-14, 22-24 months, both sexes.

(B) Osteoporosis in females was assessed using micro-CT analysis. n = 3-7, 5 and 22 months.

(C) Trabecular spacing and number were scored by micro-CT, as above.

(D) Liver sections were stained with Oil Red O. n = 6, 5 and 24 months, males.

(E) Gene expression in liver was measured by qRT-PCR. Intermediates in the cholesterol biosynthetic pathway are shown on the left and the corresponding genes on the right. Data are normalized to Myc+/+ for each comparison. n = 4, 24 months, males.

(F) Total and nonesterified cholesterol in liver extracts and serum. Normalized to Myc+/+. n = 5-6, 24 months, males.

(G) Animals of average weight were chosen for rotarod tests, and their performance was corrected for their weight. n = 3-4, 24 months, males.

Error bars represent SEM. 

See also Figure S4. 



understood, varied between tissues: it was unaffected in Myc+/-
relative to Myc+/+ mice in heart, reduced in liver and spleen, and
increased in lung. F2-isoprostanes, products of lipid peroxidation,
are a sensitive and accurate biomarker of oxidative status.
As expected, we found that levels of F2-isoprostanes rose
with age, but the changes were the same in Myc+/+ and Myc+/-
animals (Figure 5I). Hence, Myc+/- animals do not seem to be
protected from age-associated increases in ROS, or from the

Figure 5. Immunosenescence, Stress Defenses, and Metabolic Activity

(A) Total T cells (CD3+) were analyzed by flow cytometry as % of total peripheral lymphocytes. n = 8-12, 5 and 24 months, males.

(B and C) Ratio of naive to memory T cells (CD44-/CD44+) (B) and ratio of helper to killer T cells (CD4+/CD8+) (C) were measured in the same samples.

(D) Proportions of common myeloid and lymphoid progenitors (CMP, CLP), and short-term and long-term hematopoietic stem cells (ST-HSC, LT-HSC) were
scored as % of Lin- cells in bone marrow (tibia and femur). n = 6, 16 months, females. Normalized to Myc+/+ for each comparison.

(E and F) Ratio of CLP to CMP (E) and ratio of ST-HSC to LT-HSC (F) in the same samples.

(G) 53BP1-positive cells were visualized by IF in liver sections. n = 5-6, 5 and 25 months, males.

(H) Apoptotic cells in liver were identified by IF with an antibody to cleaved caspase-3. n = 6-8, 5 and 22 months, females.

(I) F2-isoprostane levels were measured in liver extracts using gas chromatography and mass spectrometry. n = 4-7, 5 and 23-27 months, males.

(J) O2 consumption by young animals over a 24 hr period. Statistical significance was computed using two-way ANOVA (time, genotype). Genotype factor was
significantly different; p < 0.001. n = 8, 5 months, both sexes.

(K) O2 consumption by old animals. P(genotype) < 0.001. n = 7-8, 17-22 months, males.

Error bars represent SEM. 

See also Figure S5. 

Figure 6. Spontaneous Activity, Energy
Metabolism, and Signaling Pathways

(A) Spontaneous home cage activity. Of the
four categories of behaviors (micromovements,
sleeping, consumption, and active behaviors),
micromovements and active behaviors were
statistically different. n = 6, 16-18 months, males.

(B) AMP concentrations in extracts of muscle.
n = 5-7, 25-30 months, both sexes.

(C) AMP to ATP ratio in the same samples as in (B).

(D) The ratio of phosphorylated (Thr172) to total
AMPKa in muscle was determined by immunoblotting.
Data are normalized to Myc+/+ for each
comparison. n = 4, 9-11 months, females; n = 3,
25-30 months, both sexes.

(E) Ribosomal RNA content of liver. n = 5,
23-25 months, males.

(F) Translation rates in live animals (liver) were
assessed using 3H-phenylalanine incorporation
into total protein. Muscle showed the same trend.
n = 5, 5-7 months, males.

(G) The ratio of phosphorylated (Ser473) to total
AKT in liver and muscle. n = 4, 9-11 months, females
(same samples as in D). Normalized to Myc+/+.

(H) The ratio of phosphorylated (Ser235/236) to
total S6 ribosomal protein in liver and muscle
(same samples as in G). Normalized to Myc+/+.
Error bars represent SEM. 

See also Figure S6. 



consequences of this and other forms of stress, such as DNA 
damage, apoptosis, and senescence. 

Effects on Metabolic Pathways that Regulate Aging 

Metabolic rate, measured by O2 consumption and CO2 production,
declines during normal aging and is increased by CR and
in the long-lived Ames and GHR-KO dwarf mice (Bartke and
Westbrook, 2012). Interestingly, Myc+/- mice have a significantly
higher metabolic rate than Myc+/+ animals (Figures 5J,
5K, S5A, and S5B). Although smaller in magnitude, this trend
is already apparent in young Myc+/- mice. Myc+/- animals
also have higher food and water consumption (Figures S5C
and S5D). Animals of both genotypes had equivalent body temperatures
(Figure S5E). The higher metabolism of Myc+/- mice
might be needed to thermoregulate their smaller body mass,
which would be expected to predispose to hypothermia. A
small but significant increase in mitochondrial copy number
was found in skeletal muscle (Figure S5F). We also assessed
spontaneous activity using a home-cage monitoring system
based on automated computer analysis of continuous video recordings
(Jhuang et al., 2010) and found that 16- to 18-month-old
Myc+/- mice displayed a higher level of active behaviors and
a lower level of micromovements (Figures 6A and S6A-S6D). Of
the active behaviors, hanging from the wire cage roof was
notable as being almost completely absent in Myc+/+ animals.
The large difference in this high-energy behavior is consistent
with the notably improved rotarod performance (Figure 4G)
and lack of osteoporosis (Figures 4B and 4C) in older Myc+/-
animals.

To further investigate these metabolic changes, we measured 
the levels of the major energy metabolites ATP, ADP, and AMP.
We found a small but significant increase in AMP levels and the
AMP to ATP ratio in muscle of old Myc+/- mice (Figures 6B and
6C), and the same trends were observed in young animals (Figures
S6E and S6F). There was also a trend toward higher ADP
levels (Figure S6G) and similar overall trends in liver (Figures
S6I and S6J). Although small in magnitude, in aggregate these
changes are suggestive of a lower energy status in Myc+/- animals,
a condition known to activate AMP-dependent kinase
(AMPK). In agreement, we observed a significantly higher ratio
of Thr172-phosphorylated AMPKa to total AMPKa (Figure 6D),
indicative of its activation.

MYC is a major regulator of ribosome biogenesis (Brown
et al., 2008). Indeed, analysis of our microarray data suggested
that genes encoding ribosomal proteins are downregulated in
Myc+/- tissues (Figure S6K). We verified this effect by biochemical
measurements of rRNA content, which showed a small but
significant decrease in Myc+/- mice (Figure 6E). Finally, to directly
query the rate of translation, we measured the in vivo incorporation
of a radioactive amino acid into total protein and also found a
significant decrease in Myc+/- mice (Figure 6F).

Myc+/- mice have reduced levels of circulating IGF-1 (Figure
2C). To more directly examine signaling through this
pathway, we assessed the activation status of protein kinase B
(AKT). We observed a significantly lower ratio of Ser473-phosphorylated
AKT to total AKT in liver and muscle of adult Myc+/-
mice (Figure 6G), indicative of reduced activity. Both AMPK
and AKT regulate mTOR, and increased AMPK and decreased
AKT activity would promote reduced mTOR activity. We assessed
S6 protein kinase activity, a downstream target of
mTOR in the protein translation pathway, by measuring the
phosphorylation state of ribosomal S6 protein. We found a lower
level of phosphorylation in both liver and muscle of adult Myc+/-
mice (Figure 6H).

Figure 7. The Effects of Myc on Healthspan and Longevity

(A) Phenotypes of mice demonstrating their interconnectedness and impact on healthspan (the key under the drawing applies to this panel only). 

(B) Pathways affected by Myc hypomorphism and their relationship to increased longevity. The components investigated in this report are highlighted in red. 



DISCUSSION 

Myc+/- mice have significantly extended median and maximum
lifespans in both sexes and a reduced mortality rate across
all ages relative to their normal Myc+/+ littermates. Myc haploinsufficiency
in Drosophila also extends lifespan, pointing to a
deep conservation of the underlying processes (Greer et al.,
2013). Myc+/- mice display ameliorated aging phenotypes
across a variety of pathophysiological processes in multiple 
organs, the breadth of which suggests that their increased 
longevity is not attributable to the prevention of a specific fatal 
disease, but rather to a broadly increased healthspan. 

Aging Myc+/- mice have healthier lipid and cholesterol metabolism,
stronger bones, less fibrosis, reduced cancer progression,
a higher metabolic rate, improved motor control, and
reduced immunosenescence compared to their normal littermates.
All of these phenotypes oppose the effects of normal
aging and are shared with many other long-lived mouse models
(Figure 7; Table S7). For example, CR and Ames dwarf mice also
have an increased metabolic rate (Bartke and Westbrook, 2012);
CR and metformin-treated mice have lower cholesterol (Martin-Montalvo
et al., 2013; Tsuchiya et al., 2004); Ames dwarf, CR and
rapamycin or metformin administration ameliorate cancer (Ikeno
et al., 2003; Martin-Montalvo et al., 2013; Neff et al., 2013); and
CR, Ames dwarf, and metformin-treated mice have better motor
coordination (Brown-Borg et al., 2012; Lanza et al., 2012; Martin-Montalvo
et al., 2013). All of these long-lived models show
reduced body mass. However, Myc hypomorphic mice also
exhibit contrasting phenotypes; for example, rapamycin administration
increases serum cholesterol and does not improve
motor coordination (Neff et al., 2013), Ames dwarf mice have
decreased bone density (Heiman et al., 2003), and CR reduces
fecundity (Lee and Longo, 2011). Compared to most other
longevity extending interventions, many of which compromise
some aspects of healthy physiology, we have to date not found
significant physiological deficits in Myc+/- mice.

How can MYC coordinate these effects? It took several decades
of work to elucidate the complex spectrum of genes regulated
by this transcription factor. In addition to directly binding the
promoters of its target genes and upregulating them, MYC also
exerts strong secondary effects. For example, it can upregulate
the expression of other transcription factors, such as TFAM
(Gomes et al., 2013). It can modulate gene expression elicited
by other factors, such as ZBTB17 (MIZ1), by directly binding to
them (Herkert and Eilers, 2010), or by displacing some factors,
such as FOXO3A, from their promoters (Peck et al., 2013). Finally,
MYC regulates a considerable number of microRNA (miRNA)
genes, with widespread effects on gene expression (Dang, 2012).

Many MYC targets, direct or indirect, are involved in anabolic 
and energy producing processes, and their upregulation is 
believed to be the major mechanism by which MYC promotes 
cancer (Dang, 2012). The largest category of genes directly up- 
regulated by MYC are involved in protein translation, and regula- 
tion of the ribosome biogenesis (RiBi) regulon is believed to be 
the core and most ancient function of MYC, extending back to 
primitive holozoans, the first appearance of the MYC-MAX complex
in evolution (Brown et al., 2008). We found that the expression
of Rpl and Rps genes was coordinately reduced in Myc+/-
tissues. In addition, levels of mature rRNA and in vivo translation
rates were decreased. Reducing translation, by a variety of
means and in multiple species, is well known to extend lifespan.
For example (among many others), yeast and nematodes with
mutated ribosomal protein genes, Drosophila overexpressing
TSC1 or TSC2, and mice fed rapamycin or mutated in S6 kinase
1 are all long-lived (Johnson et al., 2013). Given the major and
direct role of MYC on RiBi regulon expression, it is likely that at
least part of the longevity of Myc+/- mice can be attributed to
effects on translation.

TOR is a central hub in the networks that sense nutrients and
energy status, and MYC plays a key downstream role in these
pathways (Figure 7). For example, in Drosophila MYC was reported
to mediate the protein biosynthetic capacity in response
to insulin signaling (Teleman et al., 2008). Both TOR and MYC are
activated by receptor tyrosine kinase signaling, through PI3K,
RAS, and AKT, with TOR providing an acute response by
posttranslationally regulating its targets and MYC enabling a
long-term response through transcriptional regulation. Given
the downstream position of MYC, it is remarkable that Myc+/-
mice display changes throughout this nutrient and energy
sensing network, all of which have been documented to promote
longevity: increased AMPK activity, decreased AKT and S6K activities,
and even reduced circulating levels of IGF-1. None of the
genes encoding components of these pathways (including IGF-1
binding proteins) are regulated by MYC in our expression data
sets. AMPK is likely to be regulated secondarily by the reduced
energy status elicited by the higher metabolic rate and activity of
Myc+/- animals. Increased AMPK activity would then result in
reduced TOR and S6K activities. The PI3K and TOR pathways
also contain feedback loops that can be affected by MYC;
for example, acute deletion of Myc was found to impair TOR
signaling during T cell activation (Wang et al., 2011b).

Considering the major longevity models, no set of phenotypes 
or pathways emerges as a unifying correlation. In other words, 
extended lifespan can be achieved by amelioration of some 
age-associated pathophysiologies even if others are left intact. 
Aging is increasingly being viewed as a segmental process 
(Wu et al., 2013), and the phenotypes of Myc hypomorphic animals
are consistent with this notion. In agreement, a meta-analysis
of gene expression showed limited overlap between Myc+/-
mice and CR, metformin, or resveratrol. The same analysis
readily revealed previously reported connections between CR
and metformin treatment (Martin-Montalvo et al., 2013) and
underscores the apparently unique nature of the Myc-elicited
lifespan extension.

How can this observation be reconciled with the prominent 
position of MYC in nutrient sensing pathways? First, while 
MYC and TOR are both major regulatory hubs, their downstream
outputs are only partially overlapping. For example, while TOR
activity opposes autophagy (Johnson et al., 2013), MYC upregulates
it (Dey et al., 2013). Second, we have not detected significant
changes in stress levels, stress management pathways,
or stress responses in Myc+/- animals. These differences resulted
in the low overlap scores returned by the meta-analysis,
although these methods clearly detected commonalities in path- 
ways related to metabolism and the immune system. 

Production of ROS is well known to increase with age.
Although manipulation of these pathways has not yet achieved
robust lifespan extension in mammals (Salmon et al., 2010), a
stress resistance signature is a component of several important
longevity models (CR, reduced somatotrophic signaling, metformin
treatment; Table S7). The XME pathway, which promotes the
removal of ingested or endogenous toxic metabolites, has been
strongly linked with longevity in invertebrate models (Gems and
Partridge, 2013) and to a more limited extent in the mouse (Li
et al., 2013). Although the expression of some XME genes in liver
increased less with age in male Myc+/- than Myc+/+ mice, this
effect was of much smaller magnitude than in other long-lived
models. Furthermore, fibroblasts from Myc+/- mice did not
display increased resistance to a variety of chemical stressors.
We looked for changes in the production of ROS (F2-isoprostanes),
as well as some notable consequences of stress exposure
(DNA double strand breaks, apoptosis, cellular senescence,
p21 and p16 induction). Although we observed the expected
age-associated increases, we did not observe any significant
or consistent effects of the Myc+/- genotype. In aggregate, these
results indicate that Myc+/- mice can achieve impressive gains in
longevity without significant improvements in the efficacy of
stress management pathways.

Although it is possible that the phenotypes of Myc+/- mice predominantly
originate from one cell type or tissue and are mediated
in an endocrine manner, given the diverse and widespread
nature of these effects we think this is unlikely. Furthermore, cell-autonomous
effects are clearly apparent in cell culture. As floxed
Myc alleles are available (de Alboran et al., 2001; Trumpp et al.,
2001), investigating the effects of tissue-specific reduction
of MYC activity would be of considerable interest. Similarly,
the relationship between lifespan and healthspan benefits and
the level of MYC activity is not known. We tested a heterozygous
(Myc+/-) condition that resulted in ~50% levels of MYC mRNA
and protein. Given that Myc hypomorphs down to ~25% expression
are viable, it would be interesting to ask whether these
animals continue to accrue healthspan benefits.

A general theme emerging from aging studies in multiple spe- 
cies is that “less is better”: less food, less somatotrophic/IGF-1 
signaling, less anabolic activity, less protein translation, etc., and 
MYC certainly fits well in this paradigm. The majority of genes 
studied for their promotion of longevity act in signaling pathways. 
MYC, as a transcription factor, stands out because of the vast 
size of its regulome. The Myc gene is strongly upregulated by 
mitogens, and by virtue of its widespread targets constitutes 
the “gas pedal” that allows a cell to double its mass every 
18-24 hr. It is thus not surprising that MYC is necessary 
during embryonic development, and its deregulated expression 
strongly promotes cancer. Our discovery that long-term reduc- 
tion of MYC activity robustly extends lifespan greatly extends 
our understanding of the biology regulated by this fascinating 
gene. Given the variety of other physiological benefits, further 
studies of MYC and its targets should be an interesting avenue 
to explore in human medicine. 

EXPERIMENTAL PROCEDURES 
Use and Treatment of Animals 

Mice were produced and housed in a specific pathogen-free Association for
Assessment and Accreditation of Laboratory Animal Care (AAALAC)-certified
barrier facility. All procedures were approved by the Brown University Institutional
Animal Care and Use Committee (IACUC). For longevity
studies, animals were allowed to die naturally. Both females and males in
the longevity study were virgins. Animals of both genotypes and the same 
sex were housed together. 

Statistical and Demographic Analysis

Data are shown as means with SEM (unless stated otherwise). N indicates 
the number of animals per test group; age and sex are also noted. Student's 
t test (unpaired, two-tailed, equal variance) was used for all pairwise compar- 
isons. All relevant p values are shown in the figures; if not shown, the values 
were >0.05. Demographic data were processed with JMP software to
compute mean and median lifespans, SEM, percent increase of the median,
and p values (log-rank test) for each cohort. Mortality rate was calculated as
log(-log[survival]). Maximum lifespans were calculated as the proportion of
each cohort still alive when the total population reached 90% mortality, using 
Fisher’s exact test to determine statistical significance. 
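
The mortality transform and the maximum-lifespan comparison described above can be sketched in Python as follows. This is a minimal illustration and not the JMP workflow used in the study: the function names, the reading of "log" as the natural logarithm, the two-sided form of Fisher's exact test, and the example ages are all assumptions introduced here.

```python
import math


def mortality_transform(survival):
    """log(-log(S)) for a surviving fraction S strictly between 0 and 1.

    Assumption: natural logarithms; the text does not state the base.
    """
    if not 0.0 < survival < 1.0:
        raise ValueError("survival must be strictly between 0 and 1")
    return math.log(-math.log(survival))


def max_lifespan_table(ages_a, ages_b, mortality=0.9):
    """Count animals still alive in each cohort when the pooled
    population reaches the given mortality (90% by default).

    Returns (alive_a, dead_a, alive_b, dead_b).
    """
    pooled = sorted(list(ages_a) + list(ages_b))
    n_dead = math.ceil(mortality * len(pooled))
    cutoff = pooled[n_dead - 1]  # age at death that completes the quota
    alive_a = sum(age > cutoff for age in ages_a)
    alive_b = sum(age > cutoff for age in ages_b)
    return alive_a, len(ages_a) - alive_a, alive_b, len(ages_b) - alive_b


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical ages at death (months), for illustration only.
control = [10, 20, 30, 40, 50]
mutant = [15, 25, 35, 45, 55, 60, 65, 70, 75, 80]
table = max_lifespan_table(control, mutant)
print(table, fisher_exact_two_sided(*table))
```

With real cohorts, a validated statistics package would normally be preferred over a hand-written test; the sketch only shows how the 2x2 table is built from the pooled 90% mortality cutoff.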

Procedures Performed on Live Animals 

To score vaginal patency, mice were examined daily from weaning until 
vaginal opening was observed. Reproductive longevity and output of males 
were determined by continuously housing young males of either genotype 
with females until pregnancies ceased. Females were replaced every
6 months. For female fecundity, young females of either genotype were
continuously mated with one male. For rotarod tests the apparatus
(MedAssociates) was accelerated continuously from 4 to 40 revolutions per
min. O2 consumption, CO2 production, food, and water intake were measured 
using the comprehensive lab animal monitoring system (CLAMS, Oxymax 
Open Circuit Calorimeter, Columbus Instruments). Spontaneous home cage 
activity was monitored using fully automated computer vision analysis of 
continuous video recordings. Wound-healing experiments were performed 
by introducing one 6 mm diameter full thickness skin punch on the back 
between the shoulder blades. 

Histological Analysis 

Mice were euthanized in the morning (8-10 a.m.) by isoflurane anesthesia fol- 
lowed by cervical dislocation. The dissection was performed rapidly (<3 min) 
by several trained staff members working in concert on one mouse. Paraffin- 
embedded specimens were stained with H&E. OCT-embedded specimens 
were used for the determination of fibrosis (Masson’s trichrome stain), lipid 
content (Oil Red O stain), and cellular senescence (senescence-associated 
β-galactosidase stain). Immunofluorescence microscopy was used to quantify
MYC protein, cleaved caspase-3, 53BP1 foci, and macrophages. 

Analytical Procedures 

Individual gene expression was measured by qRT-PCR (SYBR Green system, 
ABI 7900 Fast Sequence Detection instrument). Total RNA was extracted from 
cells or tissue samples (20-50 mg, stored at -80°C) with Trizol reagent (Invitrogen),
purified using the RNeasy Mini kit (QIAGEN), and reverse transcribed
using the TaqMan kit (Applied Biosystems). Expression profiling was per- 
formed using Mouse 1.0 Gene ST arrays (Affymetrix), and analyzed using 
Expression Console software (Affymetrix) and Ingenuity Pathway Analysis 
software (Ingenuity Systems). rRNA was purified by affinity capture using the 
RiboMinus kit (Life Technologies). Chromatin immunoprecipitation was per- 
formed with the Magna ChIP kit (Millipore) on nuclei isolated from intact frozen 
liver tissue. Immunoblotting of proteins was performed using SDS-PAGE of 
whole cell extracts, electro-transfer onto Immobilon-P membranes (Millipore), 
and staining with the indicated antibodies. Signals were detected using the LI-COR
Odyssey (LI-COR Biosciences) infrared imaging system. Bone density
and adipose tissue were assessed using a Scanco Medical Micro-CT 40 
instrument on whole animals immediately after euthanasia. Lipids were ex- 
tracted using the Folch method and cholesterol was measured with the 
Amplex Red Cholesterol kit (Invitrogen). F2 isoprostane levels were measured 
using gas chromatography and mass spectrometry. Leptin and adiponectin 
were quantified in plasma using the Luminex magnetic bead platform (Milli- 
pore) with the mouse adipokine panel and the adiponectin single-plex assay. 
Free IGF-1 in plasma was measured with an ELISA kit from Abcam. Lymphoid
and myeloid cells in blood or bone marrow were quantified by flow cytometry. 
Blood was harvested by cardiac puncture into heparinized tubes, cells were 
collected by centrifugation, and erythrocytes were lysed with FACS lysing 
solution (Becton Dickinson). Bone marrow was flushed out of the femur and 
tibia. In both cases the cell suspensions were stained with the indicated fluo- 
rescently conjugated antibodies and analyzed using a Becton Dickinson 
FACSAria instrument. To measure AMP, ADP, and ATP, frozen tissue speci- 
mens were homogenized (Fisher Scientific, PowerGen Model 125), extracted 
in perchloric acid, and neutralized with K2CO3 on ice. Ion-paired HPLC was 
run on an Agilent 1200 series instrument using a ZORBAX Eclipse XDB-C18 
column (Agilent). To assess protein translation in live animals, L-3H-phenylalanine
(Perkin-Elmer, 100-140 Ci/mmol) was combined with unlabeled
phenylalanine (135 mM), and injected at 1 ml/100 g body weight into the tail 
vein. Animals were kept under anesthesia for the entire duration (0-30 min), 
sacrificed by cervical dislocation, and tissues were snap frozen in liquid nitrogen.
Incorporation of 3H-phenylalanine was measured by TCA precipitation of
total protein and scintillation counting. Detailed protocols are provided in the 
Supplemental Information. 
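
The dosing arithmetic for the phenylalanine injection can be checked with a short sketch. The dose rate (1 ml per 100 g body weight) and the unlabeled phenylalanine concentration (135 mM) come from the procedure above; the 30 g body weight and the function names are hypothetical and chosen only for illustration.

```python
def injection_volume_ml(body_weight_g, ml_per_100g=1.0):
    """Injection volume at the stated rate of 1 ml per 100 g body weight."""
    return body_weight_g * ml_per_100g / 100.0


def unlabeled_phe_umol(volume_ml, concentration_mm=135.0):
    """Amount of unlabeled phenylalanine delivered (mM x ml = micromoles)."""
    return volume_ml * concentration_mm


# Hypothetical 30 g mouse; the body weight is an assumption, not study data.
volume = injection_volume_ml(30.0)
print(volume, unlabeled_phe_umol(volume))
```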

SUMMARY

Protein kinase C (PKC) isozymes have remained 
elusive cancer targets despite the unambiguous tu- 
mor promoting function of their potent ligands, phor- 
bol esters, and the prevalence of their mutations. We 
analyzed 8% of PKC mutations identified in human 
cancers and found that, surprisingly, most were loss 
of function and none were activating. Loss-of-func- 
tion mutations occurred in all PKC subgroups and 
impeded second-messenger binding, phosphory- 
lation, or catalysis. Correction of a loss-of-function 
PKCβ mutation by CRISPR-mediated genome editing
in a patient-derived colon cancer cell line suppressed
anchorage-independent growth and reduced tumor
growth in a xenograft model. Hemizygous deletion
promoted anchorage-independent growth, revealing
that PKCβ is haploinsufficient for tumor suppression.
Several mutations were dominant negative, suppressing
global PKC signaling output, and bioinformatic
analysis suggested that PKC mutations cooperate
with co-occurring mutations in cancer drivers. These
data establish that PKC isozymes generally function
as tumor suppressors, indicating that therapies
should focus on restoring, not inhibiting, PKC activity.

INTRODUCTION 

The protein kinase C (PKC) family has been intensely investigated
in the context of cancer since the discovery that it is a receptor
for the tumor-promoting phorbol esters (Castagna et al.,
1982). This led to the dogma that activation of PKC by phorbol
esters promotes carcinogen-induced tumorigenesis (Griner
and Kazanietz, 2007), yet targeting PKC in cancer has been
unsuccessful.

The PKC family contains nine genes that have many targets
and thus diverse cellular functions, including cell survival, proliferation,
apoptosis, and migration (Dempsey et al., 2000). PKC
isozymes comprise three classes: conventional (cPKC: α, β, γ),
novel (nPKC: δ, ε, η, θ), and atypical (aPKC: ζ, ι). cPKC and
nPKC isozymes are constitutively phosphorylated at three priming
sites (activation loop, turn motif, and hydrophobic motif) to
structure PKC for catalysis (Newton, 2003). A pseudosubstrate
segment maintains PKC in an autoinhibited conformation that
is relieved by second-messenger binding. cPKC isozymes are
activated by binding to diacylglycerol (DAG) and Ca2+, whereas
nPKC isozymes are activated solely by DAG, events that engage
PKC at membranes. Thus, these PKC isozymes have two prerequisites
for activation: constitutive processing phosphorylations
and second-messenger-dependent relocalization to membranes.
Prolonged activation of cPKC and nPKC isozymes with
phorbol esters leads to their dephosphorylation and subsequent
degradation, a process referred to as downregulation (Hansra
et al., 1996; Young et al., 1987). aPKC isozymes bind neither
Ca2+ nor DAG.

PKC has proved an intractable target in cancer therapeutics
(Kang, 2014). PKCι was proposed to be an oncogene in lung
and ovarian cancers (Justilien et al., 2014; Regala et al., 2005;
Zhang et al., 2006), and PKCε was categorized as an oncogene
because of its ability to transform cells (Cacace et al., 1993).
However, for most PKC isozymes, there is conflicting evidence
as to whether they act as oncogenes or as tumor suppressors.
For example, PKCδ is considered a tumor suppressor because
of its pro-apoptotic effects (Reyland, 2007). However, it promotes
tumor progression of lung and pancreatic cancers in
certain contexts (Mauro et al., 2010; Symonds et al., 2011). Similarly,
both overexpression and loss of PKCζ in colon cancer cells
have been reported to decrease tumorigenicity in nude mice or
cell lines, respectively (Luna-Ulloa et al., 2011; Ma et al., 2013).
Likewise, PKCα was reported to both induce (Walsh et al.,
2004; Wu et al., 2013) and suppress colon cancer cell proliferation
(Gwak et al., 2009) and to suppress colon tumor formation in
the APCMin/+ model (Oster and Leitges, 2006). Based on the
dogma that PKC isozymes contribute positively to cancer progression,
many PKC inhibitors have entered clinical trials; however,
they have been ineffective (Mackay and Twelves, 2007).
Figure 1. A Multitude of Cancer-Associated Mutations Have Been
Identified within the Nine PKC Genes

(Left) Domain structure of conventional (α, β, γ), novel (δ, ε, η, θ),
and atypical (ζ, ι) PKC members showing priming phosphorylation sites:
activation loop (pink), turn motif (orange), and hydrophobic motif
(green). (Right) Number of TCGA cases with cancer-associated mutations
(missense, nonsense, insertions, deletions, splice site, or translation
start site) identified within each of the PKC genes.

Number of TCGA cases with mutations: PKCα, 50; PKCβ, 90; PKCγ, 102;
PKCδ, 47; PKCε, 57; PKCη, 51; PKCθ, 81; PKCζ, 28; PKCι, 48; total, 554.

In fact, a recent meta-analysis of controlled trials of PKC inhibitors
combined with chemotherapy versus chemotherapy alone revealed that PKC
inhibitors significantly decreased response rates and disease control
rates in non-small cell lung cancer (Zhang et al., 2014). Why has
inhibiting PKC failed in the clinic? It has been well established that
prolonged or repetitive treatment with phorbol esters depletes cPKC and
nPKC isozymes from cells (Blumberg, 1980; Nelson and Alkon, 2009),
bringing into question whether loss of PKC, rather than its activation,
promotes tumorigenesis.

PKC is frequently mutated in human cancers. To uncover whether loss or
gain of PKC function contributes to cancer progression, we selected
mutations throughout the primary sequence and family membership and
assessed their functional impact. Specifically, we asked how these
cancer-associated mutations alter the signaling output of PKC using our
genetically encoded reporter, C kinase activity reporter (CKAR) (Violin
et al., 2003). Characterization of 46 of these mutations revealed that
most reduced or abolished PKC activity and none were activating.
Bioinformatic analysis of all PKC mutations revealed that they may
cooperate with co-occurring mutations in oncogenes and tumor
suppressors known to be regulated by PKC. Correction of one
patient-identified, heterozygous, loss-of-function (LOF) PKCβ mutation
in a colon cancer cell line significantly decreased tumor size in mouse
xenografts, indicating that loss of PKC function enhances tumor growth.
Our data are consistent with PKC isozymes functioning generally as
tumor suppressors, reversing the paradigm that their hyperactivation
promotes tumor growth.

RESULTS 

A Multitude of Cancer-Associated Mutations Have Been 
Identified within the Nine PKC Genes 

A total of 554 mutations (as of October 2014), of which most are
heterozygous, have been identified in diverse cancers (Cerami et al.,
2012; Gao et al., 2013) within cPKC (242), nPKC (236), and aPKC (76)
isozymes (Figure 1). These mutations reside throughout the entire
coding region, with no apparent mutational hotspots. Therefore, we
conducted a comprehensive study of mutations within PKC domains and
within interdomain regions to determine how they affect PKC signaling
to contribute to cancer pathogenesis. A total of 46 mutations of both
conserved and non-conserved residues were selected from all three
classes of PKC isozymes (Table 1 and Table S1).
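
As a quick consistency check on these totals, the per-isozyme counts listed in Figure 1 can be summed by class. The sketch below is illustrative only: the dictionary and function names are our own, and the numbers are simply those reported in Figure 1 and in the text above.

```python
# Number of TCGA cases with mutations per PKC isozyme, as listed in Figure 1.
CASES_PER_ISOZYME = {
    "alpha": 50, "beta": 90, "gamma": 102,            # conventional
    "delta": 47, "epsilon": 57, "eta": 51, "theta": 81,  # novel
    "zeta": 28, "iota": 48,                           # atypical
}

# Class membership as defined in the Introduction.
PKC_CLASSES = {
    "cPKC": ("alpha", "beta", "gamma"),
    "nPKC": ("delta", "epsilon", "eta", "theta"),
    "aPKC": ("zeta", "iota"),
}


def class_totals(cases, classes):
    """Sum the per-isozyme counts within each PKC class."""
    return {
        name: sum(cases[isozyme] for isozyme in members)
        for name, members in classes.items()
    }


totals = class_totals(CASES_PER_ISOZYME, PKC_CLASSES)
grand_total = sum(totals.values())
print(totals, grand_total)
```

The class sums reproduce the 242, 236, and 76 quoted in the text, and together they give the overall total of 554.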

PKC Mutations in the Regulatory C1 and C2 Domains
Are LOF

The C1 domains of cPKC and nPKC isozymes are critical for their
activation because they mediate PKC translocation to membranes via
binding to DAG. Thus, we investigated how C1 domain mutations alter PKC
translocation and activation. To measure agonist-dependent PKC
activity, COS7 cells co-expressing the FRET-based PKC reporter (CKAR)
and equal levels of either wild-type (WT) or mutant mCherry-tagged PKC
were stimulated with the cell-permeable DAG, DiC8, or the phorbol
ester, phorbol 12,13-dibutyrate (PDBu), and phosphorylation-dependent
FRET ratio changes were recorded. Phorbol esters serve as an effective
although non-physiological tool to maximally activate PKC because they
bind with 100-fold higher affinity to C1 domains compared to DAG
(Mosior and Newton, 1998). A mutation identified in a colorectal cancer
tumor altered a residue (PKCα H75Q) required for coordination of Zn²⁺
and thus for folding of the C1 domain (Figure 2A). This mutation
ablated agonist-stimulated activity, as evidenced by a lower FRET ratio
trace compared with that of cells containing only endogenous PKC
(Figure 2B). This lower activity suggests that the mutant is dominant
negative toward global PKC output. In a head and neck cancer patient, a
mutation altered a critical residue (PKCα W58L) required for
controlling the affinity for DAG, but not phorbol ester (Dries et al.,
2007) (Figure 2A). This mutation also abolished DiC8-induced and basal
activity but retained some PDBu-induced activity, consistent with this
residue selectively regulating DAG affinity (Figures 2B and S1A).
Because membrane translocation is a prerequisite for activation of cPKC
isozymes, we compared the translocation of YFP-tagged WT and mutant PKC
to membrane-targeted CFP using FRET (Antal et al., 2014). Mutation of
either residue impaired translocation upon stimulation with DiC8,
phorbol ester (Figure 2C), or the natural agonist UTP (Figure 2D),
accounting for the inability of these agonists to activate the mutants.
Lastly, we asked how these mutations affected the processing
phosphorylations of PKC. PKCα H75Q, but not W58L, was unphosphorylated,
likely because the misfolded C1A domain of the H75Q mutant prevented
its processing (Figure 2E). Three additional mutations within the C1A
domains of PKCα (G61W), PKCβ (G61W), and PKCγ (Q62H) also exhibited
reduced agonist-induced PKC activity (Figures S1B-S1D). Our analysis of
nine C1 domain mutations revealed that five reduced or abolished
activity while none were hyperactivating (Tables 1 and S1).
Inactivation occurred by altering two key inputs required for PKC
function: disruption of binding to DAG or processing by
phosphorylations.

The C2 domain of cPKC isozymes is also critical for activation, as it
mediates Ca²⁺-dependent pre-targeting to plasma membrane, where these
isozymes bind DAG and become activated (Newton, 2003). One mutation
identified within the C2 domain of PKCγ (D193N) was present in
colorectal and ovarian cancers
Table 1. Loss-of-Function PKC Mutations in Cancer

| Mutation^a | Activity | Domain | Cancer(s) | Residue Importance | Allele Frequency | Other Mutations^b |
|---|---|---|---|---|---|---|
| γ G23E | none^c | PS | colorectal | adding negative charge to pseudosubstrate | N/A | γ G23W, δ G146R, ι G128C |
| ε R162H | low | PS | head and neck | non-conserved | 0.15 | |
| α W58L | none^d | C1A | head and neck | DAG binding; conserved in all C1A domains | 0.22 | γ W57splice, θ W171* |
| α G61W | low | C1A | lung | conserved in cPKC C1A domains | 0.05 | β G61W |
| β G61W | low | C1A | lung | conserved in cPKC C1A domains | 0.06 | α G61W |
| γ Q62H | none^c | C1A | lung | conserved in all PKC isozymes | 0.45 | α Q63H, θ Q197P |
| α H75Q | none^d | C1A | colorectal | coordinates Zn²⁺; conserved in all C1 domains | N/A | η H284Y, ι H179Y |
| γ D193N | none^c | C2 | colorectal/melanoma/ovarian | Ca²⁺ binding site | 0.28 | |
| γ T218M | none^c | C2 | stomach | non-conserved | 0.42 | γ T218R |
| γ D254N | low | C2 | endometrial/ovarian | Ca²⁺ binding site | 0.43 | |
| α G257V | none^d | C2 | lung | conserved in cPKC isozymes | 0.12 | |
| γ F362L | none^c | Kinase | endometrial | conserved in cPKC and nPKC isozymes | 0.21 | γ F362fs, β F353L |
| β Y417H | none^d | Kinase | liver | conserved in cPKC isozymes | 0.67 | γ Y431F |
| ζ E421K | none^d | Kinase | breast | APE motif; conserved in most protein kinases | N/A | α E508K, ι E423D |
| α F435C | none^c | Kinase | endometrial | conserved in cPKC and nPKC isozymes | 0.31 | |
| α A444V | low | Kinase | endometrial/breast | conserved in cPKC and nPKC isozymes | 0.27 | β A447T, γ A461T, γ A461V, δ A454V, θ A485T, ι S359C |
| γ G450C | none^d | Kinase | endometrial/lung/liver | conserved in cPKC isozymes | 0.41 | ε R502* |
| α D481E | low | Kinase | colorectal | DFG motif; conserved in most protein kinases | N/A | β D484N, γ D498N, ι D396E |
| β A509V | none^d | Kinase | breast | APE motif; conserved in most protein kinases | N/A | α A506V, α A506T, β A509T |
| β A509T | none^c | Kinase | colorectal | APE motif; conserved in most protein kinases | 0.53 | α A506V, α A506T, β A509V |
| γ P524R | none^d | Kinase | pancreatic | APE motif; conserved in most protein kinases | N/A | γ P524L, δ P517S, ε P576S, θ P548S |
| δ D530G | none^d | Kinase | colorectal | anchors the conserved regulatory spine; conserved in all eukaryotic kinases | N/A | β D523N, γ D537G, γ D537Y |
| δ P568A | none^c | Kinase | head and neck | conserved in all PKC isozymes | 0.16 | δ P568S, β P561H, γ P575H |
| β G585S | low | Kinase | lung | conserved in all PKC isozymes | N/A | η G598V |
| η K591E | low | Kinase | breast | reversal of conserved charge | N/A | η K591N, θ R616Q |
| η R596H | none^d | Kinase | colorectal | conserved in all PKC isozymes | 0.50 | |
| η G598V | none^d | Kinase | lung | conserved in all PKC isozymes | N/A | β G585S |
| β P619Q | none^d | C-tail | endometrial | PXXP motif; conserved in AGC kinases | 0.48 | |

PKC mutations showing no activity with any agonist, no activity with
physiological stimuli, or reduced activity in response to physiological
stimuli. Allele frequencies were obtained from cBioPortal.

^a Mutations examined in this study.

^b Other mutations present at the same/corresponding residue in the
same/other PKC isozymes.

^c Kinase-dead.

^d No response to physiological stimuli.



and in melanoma. Another (D254N) was found in endometrial and ovarian
cancers. Because both of these Asp residues (Figure 2F) coordinate Ca²⁺
(Medkova and Cho, 1998), we monitored their activation upon elevation
of intracellular Ca²⁺ with thapsigargin, a sarco/endoplasmic reticulum
Ca²⁺-ATPase inhibitor (Rogers et al., 1995). In contrast to WT PKCγ,
neither mutant was activated (Figure 2G) nor translocated to the plasma
membrane (Figure 2H) following thapsigargin addition, consistent with
impaired Ca²⁺ binding. However, both mutants retained full responses to
phorbol esters, consistent with unimpaired C1 domains. To further
substantiate the inability of the mutants to bind Ca²⁺, we monitored
PKC oscillatory translocation stimulated by histamine-induced
oscillatory Ca²⁺ release in HeLa cells (Violin et al., 2003). Whereas
WT PKCγ exhibited oscillatory translocation in some cells, the C2
domain mutants were unresponsive to histamine (Figure 2I). Thus, these
C2 domain mutations dampen PKCγ activity because they impede Ca²⁺
binding. Mutation of two other C2 domain residues that are not directly
involved in Ca²⁺ binding (PKCγ T218M and PKCα G257V) also caused LOF
(Figures S1D and S1E); PKCα G257V was LOF because it was not processed
by phosphorylation (Figure S1F), whereas the remaining C2 domain
mutants were (data not shown). Our analysis of six C2 domain mutations
revealed four LOF mutations and no hyperactivating ones (Tables 1 and
S1).

PKC Mutations in the Kinase Domain Are LOF 

We next evaluated 21 kinase domain mutations, two of which were within
PKCδ: D530G in colorectal cancer and P568A in head and neck cancer
(Figure 3A). Asp530 functions as an anchor for the kinase regulatory
spine, a highly conserved structural element of eukaryotic kinases
(Kornev et al., 2006; Kornev et al., 2008); not surprisingly, the D530G
mutant was kinase dead and not primed by phosphorylation (Figures 3B
and 3C). Mutation of the conserved Pro568 to Ala also prevented a
response to natural agonist stimulation but maintained some
PDBu-stimulated activity, likely because a small pool of this mutant
was phosphorylated (Figures 3B and 3C).

Strikingly, all three PKCη mutations examined (K591E, R596H, and G598V)
altered its subcellular localization by pre-localizing it at the plasma
membrane prior to stimulation (Figure 3D). However, despite
constitutive membrane association, these mutants had reduced basal and
stimulated activity as read out by a phospho-(Ser) PKC substrate
antibody (Figure 3E) because they were not processed by phosphorylation
(Figure 3F). We have previously shown that unprocessed nPKC isozymes
have exposed C1 domains that induce constitutive membrane association
(Antal et al., 2014).



A number of mutations were present within the highly conserved APE
motif that is involved in substrate binding and allosteric activation
of kinases (Kornev et al., 2008). PKCγ P524R and PKCβ A509V mutations
ablated activity by preventing processing phosphorylations, and both
exhibited dominant-negative roles (Figures 3G-3J). PKCβ A509T
(colorectal cancer) also showed loss of function in response to UTP but
was modestly activated by the potent ligand PDBu (Figure 3I), likely
because a small pool of it was phosphorylated (Figure 3J). A LOF
mutation that prevented processing of the atypical PKCζ was also found
within the APE motif (E421K; Figure S1G).

Further analysis revealed that 16 out of 21 kinase domain mutations
that we analyzed (Tables 1 and S1) resulted in full or partial LOF,
with the majority preventing processing by phosphorylation. For
example, PKCα F435C, PKCα A444V, PKCβII Y417H, PKCβII G585S, and PKCγ
G450C had impaired phosphorylation and reduced activity (Figures
S1C-S1F and S1H-S1J). However, partial LOF mutations were also observed
in cases in which phosphorylation was maintained, namely PKCα D481E
(Figures S1B and S1F) and PKCγ F362L (Figures S1D and S1J), suggesting
that these mutations likely decrease PKC's intrinsic catalytic
activity.

The Majority of Cancer-Associated PKC Mutations 
Are LOF 

Our analysis of 46 mutations present within eight of the PKC genes
revealed that ~61% (28) of them were LOF and none were activating
(Figure 4A). A lack of identification of activating mutations is not an
artifact of our assays, as activating PKC mutations that increase PKC
affinity for DAG or decrease autoinhibition are readily detectable
(data not shown). LOF mutations were identified within cPKC (α, β, γ),
nPKC (δ, ε, η), and aPKC (ζ) isozymes and occurred within the C1, C2,
and kinase domains as well as the pseudosubstrate and C-terminal tail
(Figure 4B). For example, the PKCγ G23E pseudosubstrate mutation was
not processed by phosphorylation (Figure S1J) and thus lacked any
UTP-stimulated activity (Figure S1D), and the PKCε R162H
pseudosubstrate mutation showed reduced agonist-stimulated and basal
activity (Figures S1K and S1L). The PKCβ P619Q C-terminal tail
mutation, residing within a conserved PXXP motif required for
processing (Gould et al., 2009), was also LOF as it prevented PKC
phosphorylation (Figure SIR). Overall, PKC LOF occurred by diverse
mechanisms, most commonly by preventing processing phosphorylations or
ligand binding, and as such, there were no mutational hotspots for loss
of function. However, we identified seven LOF mutation "warmspots" (Sun
et al., 2007) that fell within highly conserved regions of PKC: one
within the pseudosubstrate and six within the kinase domain (Figure
4C). Thus,
Figure 2. PKC Mutations in the Regulatory C1 and C2 Domains Are LOF

(A) Solution structure of the C1A domain of PKCγ (PDB 2E73) showing the
corresponding PKCα His75 residue that coordinates Zn²⁺ and PKCα Trp58.

(B) Normalized FRET ratio changes (mean ± SEM) representing DiC8-
(10 μM) followed by PDBu- (200 nM) induced PKC activity as read out by
CKAR in COS7 cells co-expressing CKAR and either mCherry-tagged WT,
mutant PKCα, or no exogenous PKC (endogenous).

(C) (Left) Representative YFP images of the indicated PKC isozymes under
basal and PDBu-treated conditions (200 nM; 15 min) showing relocalization of



inactivating mutations targeted conserved regulatory elements
and frequently hit the same residue, whereas mutations that
exhibited no difference from WT occurred more randomly
(Table S1).
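
The per-region tallies reported in the preceding sections can be reconciled with the overall figure of 28 LOF mutations out of 46. The sketch below is a simple bookkeeping aid, not part of the original analysis: the names are hypothetical, the C1, C2, and kinase counts are those stated in the text, and the "remaining" counts are derived here by subtraction rather than reported directly.

```python
# (analyzed, LOF) counts per region, as stated in the Results text.
REGION_COUNTS = {
    "C1": (9, 5),
    "C2": (6, 4),
    "kinase": (21, 16),
}
TOTAL_ANALYZED = 46
TOTAL_LOF = 28


def lof_percent(lof, analyzed):
    """Percentage of analyzed mutations that were loss of function."""
    return 100.0 * lof / analyzed


# Mutations outside the C1, C2, and kinase domains (derived by subtraction).
analyzed_in_domains = sum(analyzed for analyzed, _ in REGION_COUNTS.values())
lof_in_domains = sum(lof for _, lof in REGION_COUNTS.values())
remaining = (TOTAL_ANALYZED - analyzed_in_domains, TOTAL_LOF - lof_in_domains)

overall = lof_percent(TOTAL_LOF, TOTAL_ANALYZED)
per_region = {
    region: lof_percent(lof, analyzed)
    for region, (analyzed, lof) in REGION_COUNTS.items()
}
print(f"overall LOF: {overall:.1f}%")
print(per_region, remaining)
```

The three LOF mutations left over after subtraction correspond to the two pseudosubstrate mutations (PKCγ G23E and PKCε R162H) and the C-terminal tail mutation (PKCβ P619Q) described above, and 28/46 rounds to the ~61% quoted in the text.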

Analysis of cancer types most frequently harboring PKC mutations
revealed that, although PKC isozymes are mutated across many cancers,
PKC mutations are enriched in certain cancers (Figure 4D). Namely, PKC
isozymes are mutated in 20%-25% of melanomas, colorectal cancers, or
lung squamous cell carcinomas but are mutated in <5% of ovarian
cancers, glioblastoma, or breast cancers (Cerami et al., 2012; Gao et
al., 2013). Additionally, nPKC isozymes are most commonly mutated in
gastrointestinal cancers (pancreatic, stomach, and colorectal), which
have a lower mutation burden than melanomas and lung cancers,
highlighting their importance in this type of cancer (Figure 4D). The
majority of PKC mutations are heterozygous, with an allele frequency
varying from 0.05 to 0.67 for the mutations characterized (Tables 1 and
S1). This indicates that PKC mutations can be truncal events with
regard to tumor heterogeneity and exist in a majority of the cells
within a tumor or can be branchal events acquired later in
tumorigenesis as the tumor progresses to a more aggressive stage. This
is consistent with PKC mutations being co-driver events that enhance
tumorigenesis mediated by primary drivers.

Dominant-Negative PKCβ Mutation Confers a Tumor
Growth Advantage

Because the majority of PKC mutations examined were LOF, we tested
whether we could rescue HCT116 colon cancer cells that have a
heterozygous LOF frameshift mutation in the C2 domain of PKCβ by
overexpressing WT PKCβII. This resulted in a dramatic reduction in
anchorage-independent growth (Figure S2A), a hallmark of cellular
transformation. Thus, we next used CRISPR/Cas9-mediated genome editing
to ask whether



WT, but not mutant PKCα, to membranes. (Right) Normalized FRET ratio
changes (mean ± SEM) quantifying translocation of YFP-tagged PKCα
proteins toward a membrane-targeted CFP upon stimulation with 10 μM
DiC8, followed by 200 nM PDBu.

(D) Normalized FRET ratio changes (mean ± SEM) showing PKC translocation
following UTP (100 μM) stimulation.

(E) Immunoblot showing the phosphorylation state of the indicated
YFP-tagged PKCα proteins.

(F) Crystal structure of the C2 domain of PKCγ (PDB 2UZP) highlighting
Asp193 and Asp254 residues involved in Ca²⁺ binding.

(G) Normalized FRET ratio changes (mean ± SEM) showing PKC activity as
read out by CKAR upon elevation of intracellular Ca²⁺ stimulated by
thapsigargin (5 μM), followed by PDBu (200 nM).

(H) Normalized FRET ratio changes (mean ± SEM) showing translocation of
YFP-tagged PKCγ constructs toward membrane-localized CFP upon
stimulation of COS7 cells with thapsigargin (5 μM) followed by PDBu
(200 nM). Data were normalized to the maximal amplitude of
translocation for each cell and then scaled from 0 to 1 using the
equation X = (Y - Ymin)/(Ymax - Ymin), where Y is the normalized FRET
ratio, Ymin is the minimum value of Y, and Ymax is the maximum value
of Y.

(I) Normalized FRET ratio changes displaying oscillatory translocation
of YFP-tagged WT PKCγ, but not PKCγ mutants D193N and D254N, in HeLa
cells co-expressing membrane-targeted CFP and stimulated with 10 μM
histamine. Data are representative traces from individual cells of
three independent experiments.

See also Figure S1.
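
The 0-to-1 scaling described in the legend for panel (H) is a standard min-max normalization. A minimal sketch follows, assuming a single-cell trace is held as a plain list of normalized FRET ratios; the function name and the example values are illustrative and are not taken from the study.

```python
def scale_trace(trace):
    """Scale a FRET ratio trace to [0, 1]: X = (Y - Ymin) / (Ymax - Ymin)."""
    y_min = min(trace)
    y_max = max(trace)
    if y_max == y_min:
        # A flat trace has no amplitude to scale against.
        raise ValueError("trace has zero amplitude; cannot scale")
    return [(y - y_min) / (y_max - y_min) for y in trace]


# Illustrative values only (not measured data).
example = [1.0, 1.5, 2.0]
scaled = scale_trace(example)
print(scaled)
```

Scaling each cell to its own minimum and maximum makes the shapes of the translocation responses comparable across cells, at the cost of discarding differences in absolute amplitude.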
Figure 3. PKC Mutations in the Kinase Domain Are LOF

(A) Crystal structure of the kinase domain of PKCβII (PDB 2I0E)
highlighting cancer-associated residues and the regulatory spine
(yellow space filling).

(B) Normalized FRET ratio changes (mean ± SEM) showing PKC activity of
PKCδ constructs in COS7 cells co-expressing the plasma
membrane-targeted, PKCδ-specific reporter PM-δCKAR. Cells were
stimulated with UTP (100 μM) followed by PDBu (200 nM).

(C) Immunoblot analysis of the phosphorylation state of PKCδ WT and
mutants.

(D) Representative mCherry images of mCherry-tagged PKCη WT or mutants
showing localization under basal conditions and 15 min post 200 nM PDBu
addition to COS7 cells.

(E) (Left) Immunoblot showing PKC substrate phosphorylation. COS7 cells
overexpressing the indicated constructs were pre-treated with 4 μM
Gö6976 for 10 min to inhibit cPKC isozymes and were then stimulated or
not with 200 nM PDBu to activate nPKC isozymes. (Right) Immunoblots
were quantified and normalized to total PKCη levels and tubulin. Data
represent averages of three independent experiments ± SEM. Comparisons
for basal and stimulated activity were made using a repeated-measures
one-way ANOVA followed by post hoc Dunnett's multiple comparison test.
*p < 0.05 as compared with the WT group.

(F) Immunoblot analysis of the phosphorylation state of mCherry-tagged
PKCη WT and mutants.

(G) Normalized FRET ratio changes (mean ± SEM) showing PKC activity from
COS7 cells co-expressing CKAR and RFP-tagged PKCγ mutants stimulated
with 200 nM PDBu.

(H) Immunoblot depicting PKCγ WT and P524R phosphorylation. The asterisk
denotes phosphorylated and the dash unphosphorylated PKCγ.

(I) Normalized FRET ratio changes (mean ± SEM) showing PKC activity of
PKCβII constructs in COS7 cells co-expressing CKAR. Cells were
stimulated with UTP (100 μM) followed by PDBu (200 nM).

(J) Immunoblot depicting mCherry-tagged PKCβII WT and mutant
phosphorylation. The asterisk denotes phosphorylated and the dash
unphosphorylated PKCβII.

See also Figure S1.
reverting an endogenous LOF allele to WT would also rescue cell
growth. We used DLD1 colon cancer cells because they harbor a PKCβ
A509T LOF mutation (Figure 3I) to assess whether a heterozygous LOF PKC
mutation could confer a survival advantage, as most cancer-associated
PKC mutations are heterozygous. We reverted the mutation to WT in three
isogenic clones (Figures S2B and S2C) and confirmed that no sequence
alterations existed within the top two most likely predicted
off-targets (data not shown). Correction of the A509T mutation in the
endogenous PKCβ (PRKCB) allele caused a slight but reproducible
increase in the PKCβ levels and a >2-fold increase in PKCα levels,
although neither reached statistical significance (Figure 5A).
Immunoblot analysis with a phospho-(Ser) PKC substrate antibody
revealed significantly higher basal PKC activity in the corrected cells
(Figure 5B). This is consistent with the DLD1 parental cells having
reduced PKC activity because of the LOF PKCβ mutation and the lower
PKCα levels. We next tested the ability of these cells to grow in
suspension. Consistent with having higher PKC activity and a more
tumor-suppressive phenotype, the corrected cells were less viable in
suspension (Figure 5C) because they were less capable of forming the
compact multicellular aggregates formed by the DLD1 parental cells
(Figure 5D). Moreover,
Figure 4. The Majority of PKC Mutations Are LOF

(A) Pie chart of the functional impact of the investigated PKC mutations, with bright red representing mutations that lack any activity, medium red representing
mutations that show no response to physiological stimuli (DAG or Ca²⁺ elevation) but some response to non-physiological phorbol esters, light red representing
mutations that display reduced activity to physiological stimuli compared to the corresponding WT isozyme, and blue representing no difference from the
corresponding WT PKC isozyme.

(B) Domain structure of cPKC, nPKC, and aPKC isozymes, overlaid with the LOF mutations color coded by isozyme.

(C) Crystal structure of the kinase domain of PKCβII (PDB 2I0E) highlighting "warmspot" residues mutated in at least four tumor samples within the various PKC
isozymes.

(D) Bar graph depicting the percentage of mutations distributed in the indicated cancers for each PKC isozyme.



the corrected clones had decreased anchorage-independent growth
potential (Figure 5E). These results corroborate those obtained from
the HCT116 cells overexpressing PKCβII, demonstrating that partial loss
of PKCβ activity is necessary for growth in soft agar. However, in a 2D
proliferation assay, the DLD1-corrected cells proliferated at similar
rates to the DLD1 parental cells (Figure S2D), indicating that it is
not the proliferation rates that differ between these cells but,
rather, their ability to grow in the absence of anchorage.

To determine whether PKC displays haploinsufficiency, we knocked out
the mutant PKCβ allele in DLD1 cells by creating a frameshift deletion
using genome engineering (Figure S2E). This hemizygous clone (WT/- 23),
containing only one WT allele and thus expressing lower PKCβII levels
(Figure S2F), exhibited significantly increased anchorage-independent
growth potential compared to cells containing two WT alleles,
indicating that PKCβII is haploinsufficient for tumor suppression
(Figure 5E). Additionally, the PKCβ hemizygous cells did not grow as
well as the PKCβ A509T mutated cells in soft agar, indicating that this
mutation had a dominant-negative effect.

To definitively establish whether a heterozygous LOF PKCβ mutation
facilitates tumor growth in vivo, the DLD1 parental or corrected cells
were subcutaneously injected into the flanks of nude mice and tumor
growth was monitored. Consistent with our cellular data, the tumors
derived from the corrected cells were significantly smaller than those
from the DLD1 parental cells (Figures 5F and S2G). This reduced growth
correlated with increased apoptosis as assessed by TUNEL staining of
tumor sections (Figure 5G). These data demonstrate that a heterozygous,
dominant-negative PKCβ mutation can significantly increase tumor
growth, thus establishing PKCβ as a tumor suppressor.

DISCUSSION 

Here we establish that clinical trials targeting PKC have been based on
the wrong assumption; it is not inactivation of PKC but, rather,
activation that suppresses tumor growth. Thus, we propose that
therapies should target mechanisms to restore the PKC signaling output
rather than reduce it. Our comprehensive analysis revealed that 61% of
the PKC mutations characterized were LOF and none were activating. We
did not account for nonsense mutations or deletions, so an even higher
proportion of PKC mutations are LOF. Corroborating our data, three
other LOF PKC mutations have been previously described. A LOF PKCα
mutation (D294G in C2 domain) was identified in three types of cancer
(Alvaro et al., 1993; Prevostel et al., 1997; Zhu et al., 2005) and a
LOF PKCζ mutation (S514F in the kinase domain) was identified in
colorectal cancer (Galvez et al., 2009). A partial LOF mutation in PKCι
(R471C), present in three distinct cancers, disrupted substrate binding
and induced abnormal epithelial polarity (Linch et al., 2013). To our
knowledge, no gain-of-function PKC mutations have been observed in
cancer. The identification of LOF mutations throughout the PKC family
and in diverse cancers supports a general role for PKC isozymes as
tumor suppressors.

Strikingly, several LOF PKC mutations (e.g., PKCβ A509V, PKCγ P524R,
and PKCα W58L, H75Q, and G257V) acted in a dominant-negative manner by
decreasing global endogenous PKC activity. Moreover, the presence of
mutant PKCβ A509T protein in DLD1 cells reduced PKCα levels. One
mechanism for this cross-PKC dominant-negative effect is that the LOF
PKC impairs the priming phosphorylations of other PKCs, thus reducing
their steady-state levels. This is supported by a prior study
demonstrating that unprocessed kinase-dead PKC isozymes prevent the
phosphorylation of other PKC isozymes, likely because their
phosphorylation requires common titratable components (Garcia-Paramio
et al., 1998). This dominant-negative role of LOF mutations is
corroborated by studies showing that kinase-dead PKC isozymes function
in a dominant-negative manner to exhibit tumorigenic effects on cells
(Galvez et al., 2009; Hirai et al., 1994; Kim et al., 2013; Lu et al.,
1997). Importantly, although some PKC mutations were dominant negative,
loss of PKC such as would occur from nonsense mutations or gene
deletions also conferred a growth advantage (Figure 5E), indicating
that PKC is haploinsufficient for tumor suppression.



A tumor-suppressive role of PKC is supported by PKC gene knockout mouse
models and cellular studies. PKCα-deficient (Prkca-/-) mice developed
spontaneous intestinal tumors (Oster and Leitges, 2006). In an
APC^Min/+ background, loss of PKCα induced more aggressive tumors and
decreased survival (Oster and Leitges, 2006), and in the context of
oncogenic Kras, PKCα deletion increased lung tumor formation (Hill et
al., 2014). Deletion of PKCζ in mice that are PTEN haploinsufficient
resulted in larger, more invasive prostate tumors and enhanced
intestinal tumorigenesis in an APC^Min/+ background (Ma et al., 2013).
Knockdown of PKCδ in colon cancer cells increased tumor growth in nude
mice (Hernandez-Maqueda et al., 2013). Conversely, overexpression of
PKC revealed a protective role. Re-expression of PKCβI in colon cancer
cells (Choi et al., 1990) or of PKCδ in keratinocytes (D'Costa et al.,
2006) or overexpression of PKCζ in colon cancer cells (Ma et al., 2013)
or in Ras-transformed fibroblasts (Galvez et al., 2009) decreased
tumorigenicity in nude mice.

Clinical data reveal lower PKC protein levels and activity in tumor
tissue compared with cognate normal tissue, also supporting a
tumor-suppressive role for PKC. Total PKC activity was significantly
lower in human colorectal cancers versus normal mucosa because of
decreased PKCβ and PKCδ (Craven and DeRubertis, 1994) or PKCβ and PKCε
protein levels (Pongracz et al., 1995). PKCα protein was downregulated
in 60% of human colorectal cancers (Suga et al., 1998), and PKCζ was
downregulated in renal cell carcinoma (Pu et al., 2012) and non-small
cell lung cancer (Galvez et al., 2009). Decreased PKCβ and PKCδ levels
correlated with increased tumor grade in bladder cancer (Keren et al.,
2000; Langzam et al., 2001; Varga et al., 2004), and decreased PKCδ
levels correlated with increased grade in endometrial cancer and glioma
(Reno et al., 2008; Mandil et al., 2001). PKCη was downregulated in
colon and hepatocellular carcinomas, and lower PKCη expression was
associated with poorer long-term survival (Davidson et al., 1994; Lu et
al., 2009). However, increased PKCι protein and DNA copy number levels
have been observed in certain cancers (Perry et al., 2014; Regala et
al., 2005). PKCι is part of the 3q26 amplicon, and its increased DNA
copy number levels correlate with increased mRNA expression (Figure
S3). However, DNA copy number



Figure 5. Correction of a Heterozygous LOF PKCβ Mutation Reduces Growth in Soft Agar, Suspension, and a Xenograft Model

(A) Immunoblot (left) and quantification (right; mean ± SEM) of PKCβII, PKCα, and GAPDH levels in the DLD1 cells.

(B) Immunoblot (left) and quantification (right; mean ±SEM) of phospho-(Ser) PKC substrates. Comparisons were made using a repeated-measures one-way 
ANOVA followed by post hoc Dunnett’s multiple comparison test. *p < 0.05 as compared with the DLD1 parental cells. Data represent the mean of three in- 
dependent experiments ± SEM. 

(C) Relative viable cell number (mean ±SEM) as assessed by a trypan blue exclusion assay after 72 hr in suspension from three independent experiments. 
Comparisons were made by using a one-way ANOVA followed by post hoc Dunnett's Multiple Comparison test. ***p < 0.001 as compared with the DLD1 parental 
cell group. 

(D) Representative phase contrast images of DLD1 cells grown in suspension for 24 hr. 

(E) (Left) Colony formation assay in soft agar. (Right) Quantification of colony area (mean ± SEM) for colonies with a diameter >50 μm from three to six
independent experiments. Comparisons were made using a one-way ANOVA followed by post hoc Tukey's multiple comparison test. ****p < 0.0001 and ***p < 0.001
as compared with the DLD1 parental cell group.

(F) Tumor growth is presented as the mean tumor volume (mm³) ± SEM, with the red representing data from mice injected with the DLD1 parental cells (A509T/WT;
five mice) and purple representing data of the three corrected clones (17 mice total). Comparisons were made using a two-tailed, unpaired Student's t test for
each time point. **p < 0.005 and ***p < 0.0005.

(G) (Top) Representative fields from TUNEL-stained slides of tumors derived from the DLD1 cells. (Bottom) Quantification of TUNEL-positive nuclei (mean ±SEM). 
Comparisons were made using a one-way ANOVA followed by post hoc Dunnett's Multiple Comparison test. ****p < 0.0001 as compared with the DLD1 parental 
cell group. 

See also Figure S2.
and mRNA levels do not correlate for cPKC genes (Figure S3). In
fact, for PKCα, copy number levels inversely correlate with protein
levels in breast cancer (Myhre et al., 2013), the cancer in
which PKCα is most amplified (Cerami et al., 2012; Gao et al.,
2013). A number of studies reported increased mRNA expression
of other PKC genes in cancer; however, mRNA expression
and protein levels often poorly correlate (Myhre et al., 2013).
Thus, clinical data of this sort are consistent with a tumor-suppressive
function of PKC isozymes, although there might be
context-specific exceptions for PKCι.

The recent discovery that germline LOF mutations in PKCδ are
causal drivers of autoimmune lymphoproliferative syndrome and
systemic lupus erythematosus, disorders associated with the
acquisition of cancer-associated phenotypes, supports a bona
fide tumor-suppressive role of PKC in humans (Belot et al.,
2013; Kuehn et al., 2013; Salzer et al., 2013). Both diseases
are characterized by increased proliferation and decreased
apoptosis of B cells (Belot et al., 2013; Kuehn et al., 2013), and
patients frequently develop lymphomas (Bernatsky et al., 2005;
Mellemkjaer et al., 1997). Moreover, we found that siblings
homozygous for a LOF PKCδ mutation have reduced levels of
PKCζ (data not shown), supporting a dominant-negative role of
LOF mutations.

How could decreased PKC activity enhance tumorigenesis?
One possibility is that PKC isozymes suppress oncogenic signaling
by repressing signaling from oncogenes or stabilizing
tumor suppressors. Supporting this, unbiased bioinformatic
analysis of tumor samples harboring PKC LOF mutations revealed
that TP53 (p53) is one of the most frequently mutated genes
in tumors harboring LOF mutations for each PKC isozyme
(Table 2). PKC might promote the tumor-suppressive function of
p53 by stabilizing the WT protein. Considerable evidence suggests
that phosphorylation by PKCδ stabilizes p53, thus promoting
apoptosis (Abbas et al., 2004; Yoshida et al., 2006), but the
role of other PKC isozymes is less clear. KRAS was also among
the top ten genes mutated in cancers harboring PKC mutations
for seven of the PKC isozymes (Table 2), specifically with mutation
at Gly12 (Table S3). This argues that PKC might suppress
Kras signaling, such that loss of PKC would be required for
Kras to exert its full oncogenic potential. Consistent with this,
PKC modulates both the activity and localization of Kras through
phosphorylation of Ser181 (Bivona et al., 2006). Although the
role of this phosphorylation site in tumors remains controversial
(Barcelo et al., 2014), our analysis is consistent with loss of PKC
enhancing its oncogenic potential. In fact, the DLD1 and HCT116
cells used in our assays contained an oncogenic Kras mutation
(G13D) that is necessary for the ability of these cells to grow in
soft agar (data not shown). This suggests that LOF PKC mutations
are not major cancer drivers but, rather, co-drivers that
contribute to cancer progression.

We also analyzed which kinase or cancer census genes (genes 
implicated in cancer) are significantly more commonly mutated 
(>1 5-fold) in tumors harboring PKC mutations versus tumors 
lacking PKC mutations (Table S4). This allowed us to identify proteins
that might be important co-drivers or represent novel genetic
dependencies for PKC. The tumor suppressor LATS2,
which inhibits the Hippo pathway, and the kinases ROCK1 and
ROCK2, which are required for the anchorage-independent



growth and invasion of non-small cell lung cancer cells, were 
among the top 20 mutated proteins that were significantly en- 
riched in tumors harboring PKC mutations (Table S4). Our anal- 
ysis suggests that mutations in these genes provide a greater 
proliferative advantage upon loss of PKC signaling. We also 
performed an analysis of cancer-specific genes frequently co- 
mutated with PKC in lung cancer, colorectal cancer, or mela- 
noma. This revealed very little overlap in co-mutated genes 
between the three cancers and also between the three classes 
of PKC isozymes (Table S5), suggesting that the individual 
PKC isozymes regulate distinct pathways in different cancers. 
Interestingly, cancers with a high PKC mutation burden, such 
as melanoma and colorectal cancers, show little PKC amplifica- 
tion. Conversely, cancers that have higher PKC amplification 
rates, such as breast and ovarian cancers, have few PKC muta- 
tions (Cerami et al., 2012; Gao et al., 2013), consistent with PKC 
mutations having a smaller or different role in breast and ovarian 
cancers. 

The foregoing data provide a mechanism for why inhibiting 
PKC has proved unsuccessful and, in fact, detrimental in cancer 
clinical trials: it is not gain of function but, rather, LOF that confers 
a survival advantage. Therefore, therapeutic strategies should 
target ways to restore PKC activity. Bryostatin-1, a PKC agonist,
also failed as a therapeutic and, in fact, exhibited counter-thera- 
peutic effects in cervical cancer (Nezhat et al., 2004), likely 
because it downregulates PKC (Szallasi et al., 1994). Therefore, 
strategies to activate PKC without downregulating it hold signif- 
icant clinical potential. An important ramification of this study is 
that drugs that inhibit proteins involved in the processing of 
PKC cause loss of PKC. Notably, both mTOR and HSP90 inhibitors,
currently in use in the clinic (Don and Zheng, 2011; Neckers
and Workman, 2012), prevent processing of PKC (Gould et al.,
2009; Guertin et al., 2006) and would thus have the detrimental
effect of removing its tumor-suppressive function. Restoring
PKC activity would have to accompany other chemotherapeu- 
tics, given that PKC isozymes act as the brakes, not the primary 
drivers, to oncogenic signaling. Our finding that decreased PKC 
activity enhances tumor growth challenges the concept of inhib- 
iting PKC isozymes in cancer and underscores the need for ther- 
apies that restore or stabilize PKC activity in cells. 

EXPERIMENTAL PROCEDURES 
FRET Imaging and Analysis

Cells were imaged as described previously (Gallegos et al., 2006). For activity 
measurements, cells were co-transfected with the indicated mCherry-tagged 
PKC and CKAR or plasma membrane-targeted CKAR, as indicated. For trans- 
location experiments, cells were co-transfected with the indicated YFP- 
tagged PKC and membrane-targeted CFP. 

Generation of CRISPR Cell Lines 

The CRISPR/Cas9 genome-editing system was employed to generate DLD1
cell lines in which the PKCβ A509T mutation was reverted to WT or knocked
out. For the nuclease method, DLD1 cells were transiently transfected with
the hSpCas9 vector containing the gRNA PKCβ-a, the PAGE-purified
70-mer ssODN (Figure S2B), and pMAX-GFP. For the double nickase method,
DLD1 cells were transfected with two hSpCas9n vectors containing either
gRNA PKCβ-a or PKCβ-b, the ssODN, and pMAX-GFP. GFP⁺ cells were
sorted 72 hr later. To reduce off-target mutagenesis, one of the clones (WT/
WT 53) was made using a double-nicking approach that requires the
Table 2. Top 20 Genes with Mutations that Co-Occur with PKC Mutations

PKCα (50)       PKCβ (90)       PKCγ (102)      PKCδ (47)       PKCε (57)       PKCη (51)       PKCθ (81)       PKCι (48)       PKCζ (28)
BLID (7)        TP53 (42)       TP53 (52)       KRAS (13)       GNG4 (5)        SPINK7 (5)      TP53 (42)       SPRR2G (6)      TNP1 (3)
TP53 (23)       KRTAP6-2 (6)    CDKN2A (17)     TP53 (22)       KRAS (11)       RPL39 (3)       CDKN2A (13)     TP53 (26)       TP53 (15)
KRTAP19-5 (4)   PCP4 (4)        KRAS (16)       CDKN2A (9)      DEFB114 (4)     KRAS (11)       KRAS (14)       CDKN2A (10)     CNPY1 (3)
SPRR2E (4)      KRAS (12)       HTN1 (4)        CD52 (3)        CNPY1 (5)       DEFB114 (4)     SPANXN5 (5)     BANF1 (5)       SPATA8 (3)
REG3A (8)       OR4A15 (21)     SPRR2G (5)      CNPY1 (4)       SVIP (4)        PLN (3)         DEFB110 (4)     LACRT (7)       SPANXN3 (4)
H3F3C (6)       POM121L12 (18)  DEFB115 (6)     SPINK13 (4)     CXCL10 (5)      DEFB115 (5)     KRTAP15-1 (8)   CXCL9 (6)       KRTAP19-5 (2)
MLLT11 (4)      REG1A (10)      DNAJC5B (12)    ATP5E (2)       KRTAP19-3 (4)   LELP1 (5)       DEFB119 (5)     KRAS (9)        VPREB1 (4)
PI3 (5)         NRAS (11)       REG3G (10)      RPL39 (2)       COX7C (3)       DEFB116 (5)     PPIAL4G (9)     RETNLB (5)      GNG4 (2)
SNURF (3)       PLN (3)         SPATA8 (6)      COX7B2 (3)      KRTAP19-8 (3)   KRTAP19-8 (3)   DPPA5 (6)       WFDC10B (4)     ATP6V1G3 (3)
CDKN2A (7)      GNG4 (4)        REG1A (9)       OR4K1 (11)      SPINK7 (4)      IAPP (4)        CRYGB (9)       DEFB110 (3)     CDKN2A (4)
GNG3 (3)        CDKN2A (9)      POM121L12 (16)  FDCSP (3)       TP53 (18)       NPS (4)         SPANXN2 (9)     TMSB15B (2)     DEFB119 (2)
DAOA (6)        DEFA4 (5)       TRAT1 (10)      CARTPT (4)      BANF1 (4)       WFDC10B (4)     KRTAP19-3 (4)   GNG7 (3)        LGALS1 (3)
RPL39 (2)       OR2L13 (16)     HIST1H2AA (7)   DUSP22 (7)      TMSB15B (2)     S100A7L2 (5)    DYNLRB2 (6)     CNPY1 (4)       SCGB1D1 (2)
SVIP (3)        LCE1B (6)       SPINK13 (5)     BANF1 (3)       DEFA4 (4)       CNPY1 (4)       SPATA8 (5)      LSM8 (4)        NANOS2 (3)
PLN (2)         SPANXN3 (7)     CCK (6)         DYNLL2 (3)      POM121L12 (12)  TP53 (17)       KRTAP19-8 (3)   KRTAP19-5 (3)   CCL17 (2)
FAM19A2 (5)     KRTAP19-3 (4)   OR4K1 (16)      LYRM5 (3)       GYPA (6)        DPPA5 (5)       RIPPLY3 (9)     SPANXN5 (3)     NRAS (4)
CPLX4 (6)       TRAT1 (9)       OR4A5 (16)      ATP6V1G3 (4)    DYNLRB2 (5)     DEFB131 (3)     POM121L12 (14)  CSTL1 (6)       CCL1 (2)
SEC22B (8)      IFNB1 (9)       CCL7 (5)        DEFB128 (3)     HIST1H2BB (5)   SPINK13 (4)     OR4N2 (14)      DEFA4 (4)       PATE4 (2)
CTXN3 (3)       KRTAP19-8 (3)   B2M (6)         MAP1LC3B2 (4)   HIST1H2BI (5)   RPL10L (9)      DEFB115 (4)     SPANXD (4)      POM121L12 (6)
KRTAP19-3 (3)   KRTAP8-1 (3)    PCP4 (3)        GPX5 (7)        FGFR1OP2 (10)   SPRR2A (3)      OTOS (4)        EDDM3A (6)      CRIPT (2)

Data were normalized based on gene length, and the number of co-occurring cases is listed in parentheses. Two genes are highlighted: TP53 is underlined, and KRAS is in bold.
cooperation between two nickase Cas9 enzymes (Ran et al., 2013). CRISPR-targeted
clones were expanded, and gDNA was extracted using a Quick-gDNA
MiniPrep Kit (Zymo Research Corporation). Clones were screened for the
presence of two wild-type alleles by PCR using primers spanning the A509 locus,
followed by restriction digest with BtgZI. This restriction site was only present
in the WT allele, and correction of the A509T mutation introduced this site into
the other allele. The presence of a WT allele at both loci was confirmed by Sanger
sequencing (Eton Bioscience).

Xenograft Model 

Athymic Nude-Foxn1^nu mice (Harlan) were housed in compliance with the
University of California San Diego Institutional Animal Care and Use Committee.
3 × 10⁶ DLD1 cells in 100 μl PBS were injected subcutaneously into the
right flank of each 4-week-old female mouse. Tumor dimensions were recorded
twice weekly and tumor volume was calculated as 1/2 × length ×
width². Mice were euthanized 43 days after injection, and tumors were
excised. One tumor was excluded, as it did not engraft well (DLD1p), and
another was excluded, as it was not subcutaneous (WT/WT 31).
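
The volume formula above can be sketched in a few lines of Python. This is only an illustration of the stated calculation (1/2 × length × width²); the function name and the caliper readings are hypothetical and are not taken from the study.

```python
def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Approximate tumor volume as 1/2 x length x width^2.

    Both dimensions are caliper measurements in millimeters; the
    result is in cubic millimeters.
    """
    if length_mm < 0 or width_mm < 0:
        raise ValueError("tumor dimensions must be non-negative")
    return 0.5 * length_mm * width_mm ** 2


# Hypothetical measurements (mm), for illustration only.
example_length_mm = 10.0
example_width_mm = 8.0
print(tumor_volume_mm3(example_length_mm, example_width_mm))
```

Because width enters as a square, a small error in the width measurement changes the estimate more than the same error in length.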

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, three
figures, and five tables and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2015.01.001.
SUMMARY

Sensory circuits in the dorsal spinal cord integrate
and transmit multiple cutaneous sensory modalities
including the sense of light touch. Here, we identify
a population of excitatory interneurons (INs) in the
dorsal horn that are important for transmitting innocuous
light touch sensation. These neurons express
the ROR alpha (RORα) nuclear orphan receptor
and are selectively innervated by cutaneous low
threshold mechanoreceptors (LTMs). Targeted removal
of RORα INs in the dorsal spinal cord leads
to a marked reduction in behavioral responsiveness
to light touch without affecting responses to noxious
and itch stimuli. RORα IN-deficient mice also display
a selective deficit in corrective foot movements. This
phenotype, together with our demonstration that
the RORα INs are innervated by corticospinal and
vestibulospinal projection neurons, argues that the
RORα INs direct corrective reflex movements by
integrating touch information with descending motor
commands from the cortex and cerebellum.

INTRODUCTION 

Animals use the sense of touch to identify and discriminate
nearby objects, direct motor movements, and reinforce social
interactions via affective touch (Abraira and Ginty, 2013;
McGlone and Reilly, 2010; Rossignol et al., 2006). Specialized
LTMs in the skin detect a variety of touch modalities (Delmas
et al., 2011; Li et al., 2011; Vrontou et al., 2013). Skin deformation
and vibration are detected by rapidly adapting LTMs
that innervate Meissner and Pacinian corpuscles, respectively,
while slowly adapting Merkel cells and Ruffini organs respond
to indentation and stretch (Delmas et al., 2011; McGlone and
Reilly, 2010). C-LTMs and Aβ/Aδ lanceolate LTMs in hairy
skin monitor the dynamic and static displacement of hair.
Our knowledge of how specific somatosensory modalities,
including touch, are relayed and gated within the dorsal spinal
cord is more limited, in part because the functional studies
undertaken so far have primarily employed genetic mutations
that alter a broad swath of neurons (Cheng et al., 2004; Gross
et al., 2002; Muller et al., 2002; Wang et al., 2013; Xu et al.,
2013). Moreover, it is still unclear how cutaneous touch pathways
intersect with descending motor pathways to control
movement and posture.

The peripheral pathways that transmit thermoceptive, nociceptive,
innocuous mechanosensory, and proprioceptive stimuli
are highly segregated (Basbaum et al., 2009; Lallemend and Ernfors,
2012). Unmyelinated C-nociceptors and lightly myelinated
Aδ afferents that relay nociceptive, thermoceptive, and pruritoceptive
stimuli project to laminae I/II of the dorsal horn (Lallemend
and Ernfors, 2012; Todd, 2010), while large diameter fast
conducting neurons that carry proprioceptive information innervate
neurons in the intermediate and ventral spinal cord (Jankowska,
1992; Lallemend and Ernfors, 2012). Innocuous touch
modalities are transmitted by myelinated A-LTMs and unmyelinated
C-LTMs that converge on laminae IIi-IV (Abraira and Ginty,
2013; Delmas et al., 2011; Lallemend and Ernfors, 2012; Lechner
and Lewin, 2013). These and other findings argue that somatosensory
information is encoded in the periphery by labeled lines
of transmission. However, the extent to which these labeled lines
extend into the CNS remains to be determined. It is known that
cutaneous LTMs project and arborize within the dorsal horn in
a modality-specific manner (Brown, 1981; Fyffe, 1992; Light
and Perl, 1979a, 1979b; Shortland and Woolf, 1993; Woolf,
1987). Hair follicle afferents terminate in laminae IIi and III (Woodbury
et al., 2001). Meissner and Merkel cell afferents innervate
laminae III and IV (Brown, 1981; Woolf, 1987), while the arbors
of Pacinian corpuscle afferents localize to lamina III/dorsal lamina
IV and lamina V (Brown, 1981; Semba et al., 1984). Finally,
Ruffini organ afferents form collaterals in lamina III and have
processes that extend into laminae IV/V (Brown, 1981). This
anatomical organization suggests that defined mechanosensory
modalities may be transmitted and processed by discrete IN cell
types in the spinal cord.

In this study, we show that a class of dorsal spinal cord INs
expressing the RORα nuclear orphan receptor are required for
proper light touch perception. Ablating RORα INs in the caudal
spinal cord markedly reduces the behavioral responses mice
display to light touch, but not to noxious or itch stimuli. Mice
lacking RORα INs also show a marked increase in foot slips
during beam walking, indicating the RORα INs are necessary
for corrective foot movements and fine motor control. This motor
deficit, in combination with neuronal tracing experiments
showing the RORα INs are innervated by projection neurons in
the lateral vestibular nucleus (LVN) and motor cortex, reveals a
role for the RORα INs in integrating sensory inputs from cutaneous
LTMs with descending motor signals from the cortex
and vestibular system. In so doing, it provides evidence that

Cell 160, 503-515, January 29, 2015 ©2015 Elsevier Inc.










Figure 1. Characterization of RORα^Cre-Derived INs in the Spinal Cord

(A-C) Images from P10 mice showing β-galactosidase expression in the CNS (A) and spinal cord (B and C). (C) shows a transverse section
through the spinal cord (SC). Note the dorsal root ganglia (DRG) do not express β-galactosidase.

(D) Section through P10 RORα^Cre; R26^LSL-tdTomato lumbar spinal cord stained with antibodies to RORα (green) and NeuN (blue). tdTomato⁺ fluorescence (red) was
visualized without staining. Merged image shows tdTomato is largely restricted to RORα⁺/NeuN⁺ neurons in laminae IIi/III. The tdTomato⁺ cells located outside of
laminae II/III do not express NeuN. Arrows indicate double-labeled neurons. Arrowhead in (D) indicates a tdTomato⁺ blood vessel. Asterisk marks a tdTomato⁺
glial cell.

(E-F) mCherry-labeled neurons (red) in the lumbar cord of P21 RORα^Cre; R26^LSL-TVA mice. Their location in relation to excitatory PKCγ⁺ neurons (green) and
vGluT1⁺ sensory afferents (blue) is shown.

(G) Summary of the morphological profile of 51 RORα INs.

Scale bars, 1,000 μm (A), 500 μm (B), 200 μm (C), 100 μm (D), 25 μm (E and F). See also Figure S1.



the RORα INs function as an important interneuronal node for
coordinating descending motor commands with cutaneous
mechanosensory feedback.

RESULTS 

RORα Identifies a Distinct Population of Excitatory
Neurons in Laminae IIi/III of the Spinal Cord

The neurons in the spinal cord that express RORα are primarily
restricted to laminae IIi/III (Del Barrio et al., 2013). In mice
carrying both the RORα^Cre allele (Chou et al., 2013) and the
reporter allele (this study), β-galactosidase (β-gal)
expression was localized to two bilateral columns of neurons in
the dorsal horn (Figures 1A-1C). Most importantly, β-gal was
markedly absent from sensory neurons in the dorsal root ganglia
(DRG) (Figure 1C). Transverse sections through the spinal cord
showed that the RORα INs are largely restricted to laminae IIi/III,
with the intermediate and ventral spinal cord being completely
devoid of RORα expression (Figures 1C and S1A).

To confirm that Cre recombination faithfully recapitulates RORα
expression, P10 RORα^Cre; R26^LSL-tdTomato spinal cords were
stained with antibodies to DsRed (tdTomato), RORα, and NeuN
(Figure 1D). A total of 94.2% ± 1.3% of the RORα⁺/NeuN⁺ neurons
in laminae IIi/III expressed tdTomato. Conversely, 91.8% ± 2.9%
of the tdTomato⁺/NeuN⁺ neurons expressed RORα. Scattered
non-neuronal NeuN-negative tdTomato⁺ cells were present
Figure 2. RORα Identifies a Population of
Excitatory Interneurons in the Dorsal Horn
of the Spinal Cord

(A and B) Transverse sections through P10
RORα^Cre; R26^LSL-tdTomato lumbar dorsal spinal
cord showing RORα is coexpressed with Lmx1b,
but not with Pax2.

(C and D) Sections from P10 RORα^Cre;
R26^LSL-tdTomato; Gad1-GFP spinal cord showing
tdTomato is coexpressed with vGluT2 mRNA (C,
green), but not GFP (D, green).

(E) Quantification of dorsal spinal cord marker
expression in RORα INs. Cell counts obtained
from three spinal cords were pooled and expressed
as mean ± SEM. Cell counts for RORα INs
expressing Lmx1b (940/1093), vGluT2 (904/987),
Pax2 (15/1197) and GAD1 (22/845).

Scale bars, 100 μm (A and B), 10 μm (C and D). See
also Figure S2.




outside of laminae IIi/III (Figure 1D, asterisk). These cells are
predominantly endothelial cells, but also include some glial cells.
Anatomical analysis revealed that most RORα INs possess radial,
vertical or central cell-like morphologies (Figure 1G), and have
dendritic fields that are largely confined to laminae IIi/III and dorsal
lamina IV (Figures 1E, 1F, S1B, and S1C).

In keeping with their morphology, 86.1% ± 2.7% of the
RORα-tdTomato⁺ INs expressed Lmx1b (Figures 2A and 2E), which
marks glutamatergic INs in the dorsal horn (Gross et al., 2002;
Muller et al., 2002). By contrast, very few RORα-tdTomato⁺ INs
(1.2% ± 0.5%) expressed the inhibitory marker Pax2 (Figures
2B and 2E). Further comparison of tdTomato and vesicular glutamate
transporter vGluT2 expression confirmed that the RORα⁺
INs are primarily glutamatergic, with 91.6% ± 0.4% coexpressing
vGluT2 (Figures 2C, 2E, and S2A-S2C). Moreover, a large
fraction of them coexpressed cholecystokinin (Figure S2F),
which is restricted to excitatory neurons in the dorsal horn (Hu
et al., 2012; Xu et al., 2013). By contrast, only 3.1% ± 0.9% of
the RORα INs in Gad1-GFP mice expressed GFP (Figures 2D
and 2E). Further analysis revealed two major molecular subsets
of RORα INs: a dorsal population that colocalizes with PKCγ
(Figures S2D and S2F) in lamina II inner, and a more ventral
population of MafA⁺/c-Maf⁺/RORα⁺ cells that are restricted to lamina
III (Figures S2E and S2F).
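
As a rough cross-check of the marker percentages, the pooled cell counts listed in the Figure 2 legend can be converted to percentages. The sketch below is illustrative only: the function name `pooled_percent` is hypothetical, and pooled ratios need not match the per-animal mean ± SEM values quoted in the text exactly, since those are averaged across spinal cords.

```python
# Pooled counts from the Figure 2 legend: (marker-positive RORα INs, total RORα INs).
POOLED_COUNTS = {
    "Lmx1b": (940, 1093),
    "vGluT2": (904, 987),
    "Pax2": (15, 1197),
    "GAD1": (22, 845),
}


def pooled_percent(positive: int, total: int) -> float:
    """Return the percentage of counted cells that are marker-positive."""
    if total <= 0:
        raise ValueError("total cell count must be positive")
    if not 0 <= positive <= total:
        raise ValueError("positive count must lie between 0 and total")
    return 100.0 * positive / total


for marker, (positive, total) in POOLED_COUNTS.items():
    print(f"{marker}: {pooled_percent(positive, total):.1f}% ({positive}/{total})")
```

The excitatory markers (Lmx1b, vGluT2) come out far higher than the inhibitory markers (Pax2, GAD1), in line with the conclusion that the RORα INs are primarily glutamatergic.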

Innervation of RORα INs by Mechanosensory Neurons

In view of their location in laminae IIi/III, we set out to determine if
the RORα INs receive afferent input from innocuous mechanosensory
neurons. CGRP and IB4 stainings were used to compare
the location of the RORα INs with the termination zones of
peptidergic and non-peptidergic nociceptive
afferents (Todd, 2010). The cell bodies
of RORα INs were located ventral to
CGRP⁺ and IB4⁺ afferents in laminae I/II
and II, respectively (Figure S3A). Only
rarely did we detect putative contacts
from nociceptive afferents onto RORα
INs, indicating the RORα INs receive
very little direct nociceptive input. By
contrast, when a c-Ret-CFP transgene (Uesaka et al., 2008)
was used to trace the central processes of mechanosensory
afferents, we identified numerous CFP⁺/vGluT1⁺ synaptic contacts
onto RORα INs (Figure S3B). These findings provide evidence
that the RORα INs are innervated by cutaneous LTMs
as opposed to cutaneous nociceptors.

Monosynaptic retrograde tracing using a pseudotyped
EnvA-SADΔG-mCherry rabies virus (Wickersham et al., 2007) was employed
to identify the sensory neuron cell types that innervate
spinal RORα INs. These experiments were performed using an
intersectional reporter (Li et al., 2013; Stam et al., 2012) that in
combination with RORα^Cre and Cdx2::FlpO restricts rabies virus
entry to RORα INs in the caudal spinal cord (Figures 3A-3C).
Sensory neurons are not directly infected due to their lack of
RORα expression. Injection of EnvA-SADΔG-mCherry rabies virus
into the spinal cords of P5 RORα^Cre; Cdx2::FlpO intersectional
reporter mice resulted in mCherry expression in multiple cutaneous
LTM cell types, including cells with the following expression profiles:
c-Ret⁺/IB4⁻ Aβ-LTMs (Figures 3D and 3K; 24.7% ± 2.3%),
TrkC⁺/parvalbumin⁻ Aβ-LTMs (Figures 3E and 3K; 22.0% ±
2.5%), and TrkB⁺ Aδ-LTMs (Figures 3F and 3K; 20.3% ± 2.3%).
Very few IB4⁺ non-peptidergic nociceptors (Figures 3D and 3K;
0.66% ± 0.33%) and parvalbumin⁺ proprioceptors (Figures 3E
and 3K; 3.1% ± 1.9%) were labeled, demonstrating the RORα
INs are primarily innervated by LTM afferents. We also identified
a population of presynaptic neurons that express CGRP and
NF200 (Figures 3K and S3G), which are likely to be a subtype of
mechanoreceptor (Lawson et al., 2002). Our failure to detect
mCherry-labeled tyrosine hydroxylase⁺ C-LTM neurons (Figure 3G;
Li et al., 2011) indicates that Aβ-LTMs and Aδ-LTMs,
Figure 3. Transsynaptic Rabies Virus
Tracing of Sensory Inputs to RORα INs

(A and B) Schematic showing the allele
and strategy used for tracing monosynaptic inputs
to the RORα INs using an EnvA-pseudotyped
ΔG-mCherry-rabies virus.

(C) Section through P10 RORα^Cre; Cdx2::FlpO
intersectional reporter spinal cord showing mCherry-rabies
virus labeled cells and RORα INs (green).
mCherry⁺/GFP⁻ cells represent transsynaptically
labeled presynaptic neurons.

(D-G) Sections from P10 RORα^Cre; Cdx2::FlpO
intersectional reporter lumbar DRG showing presynaptically
labeled sensory neurons and sensory neuron
subtype markers as indicated.

(H) Section of hindlimb footpad stained with antibodies
to mCherry (red) and NF (blue) showing
Meissner corpuscle innervation.

(I) Section of hindlimb hairy skin stained with antibodies
to mCherry (red), NF (blue), and TrpV3
(green), which recognizes Merkel cells.

(J) Section of hairy skin showing a D-hair afferent
following staining with antibodies to mCherry (red),
TrkB (green) and DAPI (blue).

(K) Quantification of sensory neuronal markers in
relation to total number of mCherry⁺ neurons. n = 3
mice.

(L) Summary of sensory afferent types that are
presynaptic to the RORα INs.

Abbreviations: NF, neurofilament; CB, calbindin;
CR, calretinin; PV, parvalbumin; TH, tyrosine
hydroxylase. Dashed lines define the boundary of the
spinal cord (C), DRG (D-G), dermal papillae (H) and
hair follicles (J). Arrows indicate double-labeled
neurons or sensory afferents. Scale bar, 50 μm
(D-G), 25 μm (H-J). See also Figure S3.








rather than C-LTMs, are the primary source of mechanosensory
input to the RORα INs.

Our observation that rabies-derived mCherry labels the distal
processes of sensory neurons allowed us to unambiguously
identify the mechanosensory neuron cell types that innervate
the RORα INs. mCherry⁺/NF200⁺ Aβ afferent fibers were present
in the dermal papillae of glabrous skin where Meissner
corpuscles are located (Figure 3H). This is consistent with
mCherry-rabies labeling of calbindin⁺ DRG afferents that innervate
Meissner corpuscles (Figure S3E). Calretinin⁺ DRG neurons
and their associated Pacinian corpuscle
nerve endings did not express mCherry
(Figures S3C and S3D), indicating the
RORα cells are not innervated by Pacinian
LTMs. In hairy skin, we observed
mCherry-labeled afferents innervating
Merkel cells (Figure 3I). Aδ D-hair terminals
(Figure 3J) and transverse lanceolate
endings (Figure S3H) were also retrogradely
labeled. The morphologies of
the hairy skin LTMs labeled by rabies
virus are consistent with the expression
profile of mCherry in DRG neurons, as
D-hair and lanceolate neurons express
a combination of c-Ret, TrkB, or TrkC (Figure 3L) (Lallemend
and Ernfors, 2012). mCherry⁺ afferents were occasionally seen
together with Ruffini organ endings in the foot muscles (Figure S3F),
although the exact extent of RORα IN innervation by
Ruffini LTMs remains to be determined.

RORα INs Are Innervated by A-Fiber Afferents and Are
Selectively Activated by Light Touch

Whole-cell recordings from RORα INs in lamina III revealed a
spectrum of excitatory potentials that closely match the profile of
Figure 4. RORα INs Are Functionally Innervated by Low Threshold Mechanosensory Afferents

(A) Example of a RORα IN in L4 displaying an A fiber monosynaptic current triggered by low intensity stimulation. Monosynaptic connectivity was confirmed by a
2 Hz frequency stimulation.

(B) Example of a RORα IN with dual A and C inputs.

(C) Classification of potentials in RORα INs induced by dorsal root stimulation. n = 31 cells.

(D-I) Sections through the lumbar spinal cord of P6 RORα^Cre reporter mice stained with antibodies to c-Fos and β-gal after light brushing (D-F) or injecting
capsaicin into the footpad (G-I). Arrowheads in (F) and (I) indicate c-Fos⁺ RORα INs. (E) and (H) show the contralateral side.

(J) Fraction of RORα INs (β-gal⁺) coexpressing c-Fos. Only laminae IIi/III cells located within the domain displaying elevated c-Fos expression were counted. Cell
counts of RORα INs exhibiting c-Fos immunoreactivity: brush stimulation of glabrous skin, 1593/2830 RORα cells; brush stimulation of hairy skin, 834/2103 RORα
cells; capsaicin injection, 20/711 RORα cells.

n = 4 animals for each experiment. Data: mean ± SEM. Scale bar, 100 μm (D, E, G, H).



presynaptic connections seen in the rabies tracing experiments.
Short latency-low threshold potentials (<7 ms) with no failures
and low jitter were detected in 11 of 31 cells following dorsal
root stimulation (Figure 4A). These potentials most likely represent
monosynaptic Aβ fibers, as Ia muscle spindle afferents do
not terminate in lamina III (Brown, 1981; Jankowska, 1992).
Low to intermediate threshold potentials that are likely to be
either monosynaptic Aδ or A polysynaptic inputs were detected
in another 12 cells (Figure 4C). Six cells displayed long latency-high
threshold potentials (>25 ms) that are characteristic of
polysynaptic C-fiber inputs. Interestingly, two of these cells
possessed an additional short latency-low threshold input,
indicating some RORα INs receive a combination of myelinated
A and unmyelinated C fiber innervation (Figure 4B). The exact
nature of this polysynaptic C fiber innervation remains to be
determined, although it may be derived from C-LTMs that sense
pleasant touch (Liu et al., 2007; Vrontou et al., 2013).

We then asked if light brush stimulation activates the immediate 
early gene c-Fos in RORa INs. Light brushing of the plantar 
(underside) surface of the foot (Figures 4D-4F and 4J), and the 
hairy skin of the lower hindlimb (Figure 4J), resulted in expression 
of c-Fos in laminae IIi/III RORa INs. By contrast, injection of 
capsaicin into the foot resulted in very little c-Fos induction 
in RORa INs (3.1% ± 0.9%; Figures 4G-4I and 4J). Instead, 
c-Fos induction was largely restricted to laminae I/II and lamina 
V. Some activation of c-Fos was also observed in laminae I/II 
neurons in response to brushing (Figure 4D). This may reflect 
sensitization due to the prolonged brush stimulation needed 
for c-Fos induction in laminae IIi/III neurons, or it might be due 
to the activation of nociceptive pathways by innocuous touch 
at early postnatal stages (Koch et al., 2012). 

Ablating RORa INs in the Dorsal Spinal Cord Impairs 
Light Touch 

To determine the functional contribution RORa INs make to 
somatosensation, we employed a recently developed intersectional 
allele (Figure 5A) to inducibly ablate the RORa 
INs in the adult spinal cord. When used in combination with 
RORa'^™ and a Cdx2::FlpO transgene, which is only expressed 
in caudal regions of the mouse (O.B. and M.G., unpublished 
data), we were able to selectively target diphtheria toxin receptor 
(DTR) expression to RORa INs in the caudal spinal cord. 
RORa-expressing cells in the anterior CNS and in non-neuronal 
tissues were not affected by diphtheria toxin (DTX) treatment 
(Figures 5B and 5C). Using a R26LSL-tdTomato reporter to trace 
the RORa cells, we estimate that >95% of the RORa INs in the 
lumbar cord were ablated by DTX treatment (Figures 5D-5F). 
Cell killing was largely restricted to excitatory Lmx1b+ INs in 
laminae IIi/III (Figures 5E and 5G), where RORa and Lmx1b are 
coexpressed, with the number of inhibitory Pax2+ neurons in 
the dorsal horn remaining unchanged (Figure 5H). 

To confirm cell ablation is restricted to RORa INs, we 
analyzed the expression of several excitatory dorsal horn 
markers (Brohl et al., 2008; Del Barrio et al., 2013; Gross 
et al., 2002; Muller et al., 2002). As expected, there was a 
corresponding reduction in the number of cells expressing PKCγ, 
MafA, c-Maf, and calbindin (Figures S4A-S4D), while excitatory 
calretinin+ INs that do not express RORa and GAD1+ inhibitory 
INs were unaffected (Figures S4E and S4F). We also verified 
that RORa IN ablation does not alter the laminar organization 
of sensory afferent terminals in dorsal spinal cord. CGRP+ and 
IB4+ nociceptive terminals projected normally to laminae I/II 
(Figures S4G and S4H) and there was no gross change in the 
central projections of vGluT1+ mechanosensory afferents 
(Figures S4I and S4J). The termination patterns of peripheral nerve 
endings (Figures S4K-S4L) were also unchanged following 
RORa IN-ablation, demonstrating that the loss of these cells 
does not cause any major change in central and peripheral 
sensory innervation. 

RORa IN-ablated mice were then subjected to a battery of 
sensory tests. Strikingly, mice lacking spinal RORa INs displayed 
a specific sensory impairment in dynamic and static light touch. 
In particular, the response to light brushing on the plantar surface 
of the foot was markedly diminished as compared to control 
littermates (Figure 5I; 30% ± 7.9% versus 76% ± 6.5% of trials). 
Furthermore, the latency to detect a piece of sticky tape on 
the plantar surface of the foot was increased 2.5-fold 
(Figure 5J; control 78.9 ± 17.7 s versus RORa IN-ablated 192.9 ± 
20.4 s), thereby demonstrating a substantial reduction in 
responsiveness to static touch. We also observed a marked reduction in 
tactile sensation on the hairy skin of mice lacking RORa INs 
(Figure 5K), indicating the RORa INs contribute to light touch 
perception in hairy skin. By contrast, RORa IN-deficient mice displayed 
no change in their responses to mechanical pain (Figures 5L-5N) 
or sensitivity to heat and cold (Figures 5O and 5P). Responses to 
chemically induced itch were also unchanged (Figures 5Q and 
5R). Finally, there was no difference in acute pain sensation 
following the injection of capsaicin and formalin (Figures 5S 
and 5T). In summary, depleting RORa INs from the dorsal spinal 
cord results in the selective loss of light touch without altering 
responsiveness to mechanical, thermal, and chemical pain, 
and to chemical itch. 

RORa IN-Ablated Mice Display Deficits in Corrective 
Motor Movements 

Our ability to genetically manipulate the RORa IN population led us 
to examine the contribution that the RORa INs make to motor 
control in mice (Figure 6). Gross behavioral analyses failed to uncover 
any pronounced change in locomotor activity in RORa IN-ablated 
mice (Figure S5), with most measures of locomotor function being 
largely normal. There was no difference between control and 
RORa IN-ablated mice in total distance traveled, vertical activity, 
or total number of jumps performed during open-field testing 
(Figures S5A-S5C). Muscle strength, as assessed by the hanging wire 
test, was also normal in the RORa IN-ablated mice (Figure S5D). 
There was also no marked change in gross motor coordination 
as assessed by the accelerating rotarod and limb coordination 
during treadmill and ladder beam walking (Figures S5E-S5I). 

By contrast, when the raised beam test was used to assess 
fine motor control during locomotion, a significant increase 
in hindlimb missteps and slips was seen in RORa IN-ablated 
mice traversing a 5 mm beam as compared to control mice 
(Figures 6A-6C; 3.73 ± 0.73 episodes per crossing for RORa 
IN-ablated mice versus 1.29 ± 0.3 episodes per crossing for 
control mice). Most notably, this increase in slips/footfalls was 
restricted to the hindlimbs, which is consistent with the depletion 
of RORa INs at hindlimb levels but not at forelimb levels. In sum, 
our results demonstrate that RORa IN-mediated feedback to the 
motor system is essential for corrective movements, while being 
largely dispensable for gross motor movements. 

RORa INs Form Synaptic Contacts with Motor Neurons 
and Premotor Neurons 

To further probe the nature of the RORa IN motor phenotype, we 
asked if the RORa INs provide excitatory inputs to motor neurons 
and molecularly identified premotor IN cell types that control 
locomotion in mice (Arber, 2012; Goulding, 2009; Grillner and 
Jessell, 2009; Kiehn, 2011). To visualize the terminal processes 
of RORa INs, RORa^'^ mice were crossed with conditional 
reporter mice (Buffelli et al., 2003). We observed 
multiple YFP+/vGluT2+ contacts on motor neurons in the lumbar 
lateral motor column (Figure 6D). These vGluT2+ contacts are 
likely to be functional synapses as we find mCherry-positive 
premotor RORa INs in the lumbar spinal cords of 
Tau'-®'-''®'^^ mice following injection of SADΔG-mCherry rabies 
virus and AAV-G into the gastrocnemius (ankle extensor) and tibialis 
anterior (ankle flexor) muscles (Figures 6E-6G). We also looked 
for YFP+/vGluT2+ contacts on lamina X V0c cholinergic neurons 
(Stepien et al., 2010; Zagoraiou et al., 2009), which are the source 
of muscarinic cholinergic inputs to motor neurons (Miles et al., 
2007). RORa IN-derived contacts were observed on the soma 
of these cells (Figure 6H), suggesting the RORa INs are a source 
Figure 5. Spinal Ablation of RORa INs Impairs Touch but Not Nociceptive Behaviors 

(A) Schematic illustrating the allele. 

(B and C) Brain and spinal cords from P84 RORa^^^\ (control) and RORa^^^] Cdx2::FlpO] (RORa IN-ablated) mice showing β-galactosidase 
reporter expression. 

(D and E) Transverse sections through the lumbar dorsal horn of adult control (D) and RORa IN-ablated (E) mice 14 days after injecting DTX showing tdTomato 
(red) and Lmx1b (green) expression. Arrows indicate double-labeled neurons. 

(F) RORa+ IN numbers are reduced by 95% following DTX-ablation (3.16 ± 0.46 cells versus 70.30 ± 6.01 cells), ***p < 0.001. 

(G) Lmx1b-expressing cells are reduced by 30% in RORa IN-ablated cords (112.1 ± 6.9 cells versus 161.0 ± 9.5 cells), *p < 0.05. 

(H) Pax2+ cell numbers are unchanged in RORa IN-ablated cords (113.1 ± 4.7 cells versus 113.9 ± 2.9 cells), p > 0.05. 

(I) RORa IN-ablated mice show a significant decrease in paw withdrawal to dynamic light brush (30% ± 7.9% versus 76% ± 6.5% of trials, ***p < 0.001). 

(J and K) RORa IN-ablated mice show a significant increase in latency to static light touch as measured using the sticky tape test (J, 192.9 ± 20.4 s versus 78.9 ± 
17.7 s, ***p < 0.001) and detecting a 1.5 mm alligator clip on the hairy skin (K, 47.3 ± 11.5 to 20.9 ± 4.2, *p < 0.05). 

(L-N) Mechanical pain measured by von Frey filament, Randall Selitto and pinprick does not differ between control and RORa IN-ablated mice. 

(O-R) Sensitivity to heat (Hargreaves) and cooling (acetone) in RORa IN-ablated mice is normal, as are responses to 48/80 and chloroquine. 

(S and T) Chemical pain induced by injecting capsaicin or formalin into the footpad is also unchanged in RORa IN-ablated mice. 

Data: mean ± SEM. p values above 0.05 are not significant (ns), n = number of mice tested. Scale bar: 100 μm. See also Figure S4. 
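
The percentage reductions and the fold change quoted for panels (F), (G), and (J) follow directly from the reported group means. The short sketch below reproduces that arithmetic; the helper names `percent_reduction` and `fold_change` are introduced here for illustration and only the means given in the legend are used.

```python
def percent_reduction(control, ablated):
    """Reduction of the ablated mean relative to the control mean, in percent."""
    return 100.0 * (control - ablated) / control


def fold_change(control, ablated):
    """Ratio of the ablated mean to the control mean."""
    return ablated / control


# Group means taken from the Figure 5 legend.
ror_reduction = percent_reduction(control=70.30, ablated=3.16)    # panel F, RORa+ INs
lmx1b_reduction = percent_reduction(control=161.0, ablated=112.1)  # panel G, Lmx1b+ cells
tape_fold = fold_change(control=78.9, ablated=192.9)               # panel J, latency in s

print(f"RORa+ IN reduction: {ror_reduction:.1f}%")
print(f"Lmx1b+ cell reduction: {lmx1b_reduction:.1f}%")
print(f"Sticky tape latency fold change: {tape_fold:.2f}")
```

These come out at about 95.5%, 30.4%, and 2.4-fold, in line with the rounded values given in the legend and main text.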



of excitatory drive to the V0c IN population. DsRed+/vGluT2+ 
bouton-like contacts were also found on GFP+ V2a INs in 
mice indicating putative 
synaptic connections between RORa INs and Chx10+ V2a INs 
(Figure 6I). These analyses are likely to substantially 
underestimate the number of putative RORa IN-derived synaptic contacts 
on premotor and motor neurons, given that excitatory contacts 
are most abundant on dendrites. Taken together, these tracing 


Cell 160 , 503-515, January 29, 2015 ©2015 Elsevier Inc. 509 

















[Figure 6 image: panels A and B show control and RORa IN-ablated mice on the beam; panel C plots the number of slips for 5 mm and 12 mm beams. The figure legend appears later in the text.] 




studies provide evidence that excitatory RORa INs relay touch 
information from cutaneous LTMs to the spinal motor system. 

In view of the contribution that the corticospinal tract (CST) and 
lateral vestibulospinal tract (LVST) pathways make to fine motor 
control and balance (Armstrong, 1988; Ito, 2012), we asked 
whether CST neurons and LVST neurons synapse directly onto 
the RORa INs in the lumbar spinal cord. Targeted infection of lum- 
bar level RORa INs with EnvA-SADAG-mCherry rabies virus re- 
sulted in monosynaptic rabies virus labeling of LVST neurons in 
the lateral vestibular nucleus (Figure 7A). These cells, which are 
innervated by calbindin+ Purkinje cells (Figure 7A, inset, arrows), 
constitute a major vestibular efferent pathway from the cere- 
bellum to the spinal cord (Ito, 2012). We also found multiple 
mCherry-labeled neurons (Figure 7B) in lamina V of the contralat- 
eral mouse primary motor cortex that had the typical pyramidal 
morphology of corticospinal projection neurons (Figures 7B and 
7C). When whole-cell recordings were performed on RORa INs 
in the lumbar cord (L4), eight of ten RORa INs displayed mono- 
synaptic excitatory potentials in response to stimulating A fiber 
sensory afferents and the ventral dorsal funiculus that contains 
the axons of corticospinal projection neurons (Figures 7D-7F). 
The protocol that we used has been shown to selectively activate 
descending corticospinal axons (Hantman and Jessell, 2010), 



Figure 6. Corrective Movements Are Impaired in RORa IN-Ablated Mice 

(A and B) Images showing control (A) and RORa IN-ablated (B) mice crossing a 5 mm wide beam. 
Arrow indicates position of the foot. Note the position of the foot in B showing a foot slip. 

(C) Number of slips and footfalls for 5 mm and 12 mm wide beams. White circles indicate control 
mice (17 and 11 mice tested for 5 and 12 mm beams, respectively). Black circles indicate RORa 
IN-ablated mice (19 and 12 mice tested for 5 and 12 mm beams, respectively). 

(D and H) P21 RORa‘=™; spinal cord sections stained with antibodies to GFP (green), 
vGluT2 (blue) showing synaptic boutons (arrows) on ChAT+ motor neurons (D), and ChAT+ V0c 
neurons (H). 

(E-G) Retrograde monosynaptic mCherry-rabies virus labeling of RORa INs following injection into 
the tibialis anterior (F) and gastrocnemius muscles (G). The tracing protocol is shown in (E). 

(I) P21 RORa°™; 026'-®'-'“’^°"’=’*°; ChxlO®®'’ spinal cord section stained with antibodies to DsRed 
(red), GFP (green) and vGluT2 (blue), showing synaptic boutons (arrows) on a Chx10+ V2a IN 
(green). 

Data: mean ± SEM, **p < 0.01, ns, p > 0.05. Scale bars, 5 μm (D), 10 μm (F and G), 2 μm (H and I). See 
also Figure S5. 



however, some synaptic potentials 
might arise from the antidromic stimulation 
of ascending dorsal column axons. 
Additional evidence for the presence of 
convergent corticospinal and sensory 
inputs onto the RORa INs comes from 
anatomical studies showing lumbar 
RORa INs are decorated with PKCγ+/vGluT1+ and Emx1-GFP+/vGluT1+ 
contacts from corticospinal projection neurons (Figures 
7G and 7H). These same cells are also contacted by vGluT1+ 
processes, which are likely to be derived from mechanosensory 
afferents (Figures 7G and 7H, arrowheads). 

DISCUSSION 

This study identifies a specific class of neurons in the dorsal spi- 
nal cord that has an essential role in sensing light touch. The 
RORa INs transmit innocuous mechanical stimuli from both the 
hairy and glabrous skin, and depleting the spinal cord of RORa 
INs leads to a selective mechanosensory deficit that closely 
matches the repertoire of inputs these cells receive from cuta- 
neous LTMs. We propose that the RORa INs serve as an integra- 
tive node that merges sensory input from cutaneous LTMs with 
descending signals from the cortex and cerebellum to generate 
the postural adjustments and corrective foot movements that are 
used to counteract foot slippage. 

Coding of Mechanical Stimuli by RORa INs 

Our results showing discriminative light touch behaviors are 
impaired in RORa IN-ablated mice (Figures 5I and 5J), whereas 
Figure 7. Convergence of Sensory and Descending Inputs onto RORa INs 

(A-C) Transverse sections from a P15 Cdx2::FlpO; brain 5 days 
after infecting RORa INs in the lumbar cord with an EnvA G-deleted rabies-mCherry virus. 
Transsynaptically labeled neurons (red) in the lateral vestibular nucleus (LVN) are innervated 
by calbindin+ Purkinje cell afferents (green, A). Pyramidal neurons in the frontal motor 
cortex area (FMA) are transsynaptically labeled (B and C). 

(D) Hemicord preparation showing the stimulation-whole patch recording set-up for RORa 
INs. The region of the dorsal funiculus that contains corticospinal axons was stimulated 
with a concentric bipolar electrode 5 segments (3-4 mm) rostral to the recorded cell (i). A 
single dorsal root (L4-L6) was stimulated with a glass suction electrode (ii). RORa cells 
(green) in L4 were patched and EPSCs were recorded (iii). R, rostral; C, caudal; D, dorsal; 
V, ventral. 

(E) Representative recording of an A fiber-type dorsal root evoked potential in a RORa IN. 
Magnification of onset is shown. 

(F) Putative corticospinal tract evoked EPSCs in the RORa IN that is shown in (E). Latency and 
jitter properties are consistent with monosynaptic connectivity (see magnification of 
onset). 

(G) Section from a P21 RORa^^; 
spinal cord showing PKCγ+/vGluT1+ corticospinal processes contacting a RORa IN (red). Arrow 
indicates a PKCγ+/vGluT1+ corticospinal-derived synaptic bouton. Arrowhead indicates a vGluT1+ 
sensory contact. 

(H) Section from a P21 RORa^^^; R26^' 
Emx1^^^ spinal cord stained with antibodies to DsRed (red), GFP (green) and vGluT1 (blue), which 
confirms GFP+/vGluT1+ corticospinal contacts (arrows) onto RORa INs (red). Arrowhead indicates a vGluT1+ sensory contact. 

(I) Schematic showing the RORa IN connectivity and an outline of the RORa IN spinal circuit for light touch. NB. Arrows to and from the cerebellum do not indicate 
direct connections. 

Scale bar, 2 μm (G and H). See also Figure S6. 
nocifensive behaviors are not (Figure 5), provide evidence that 
somatosensory inputs to the spinal cord are processed and 
transmitted in a modality-specific manner. This finding, together 
with studies showing excitatory neurons located dorsal to the 
RORa INs in lamina II transmit mechanical pain (Duan et al., 
2014), is consistent with the existence of ‘labeled’ interneuronal 
lines of transmission within the CNS, and it supports a model in 
which innocuous versus noxious touch modalities are routed 
through different interneuronal pathways in the spinal cord. We 
propose that light touch is gated by RORa INs in laminae IIi/III, 
whereas mechanical pain is processed by somatostatin+ 
neurons in lamina II, which is in accordance with the general 
organization of nociceptive and mechanosensory projections in the 
dorsal spinal cord (Abraira and Ginty, 2013; Lechner and Lewin, 
2013; Todd, 2010). 

The RORa INs are innervated by multiple LTM subtypes (Fig- 
ures 3 and S3), indicating that they process multiple streams of 
mechanosensory input from the skin. In the trigeminal vibrissa 



system, tactile inputs converge on projection neurons that 
receive tactile information from slowly adapting Merkel cell affer- 
ents and from rapidly adapting lanceolate afferents (Sakurai 
et al., 2013). The RORa INs differ from these projection neurons 
in that they are exclusively local circuit neurons (Figures 1 and 
S1). The majority of RORa INs appear to be innervated by a 
single sensory afferent fiber type, with 29 of 31 recorded cells 
displaying a single depolarizing potential following dorsal root 
stimulation (Figure 4). Although this finding is consistent with 
innervation by a single mechanosensory cell type, we cannot 
exclude the possibility that individual RORa INs receive more 
than one type of mechanosensory input. For example, dual 
depolarizing potentials consistent with monosynaptic A and 
polysynaptic C inputs were detected in two RORa cells (Figure 4B), 
raising the possibility that some RORa cells merge sensory infor- 
mation from A-LTMs and C-LTMs. 

Our observation that the RORa INs are anatomically and 
molecularly heterogeneous (Figures S1 and S2) suggests that 
subpopulations of RORa INs may process different types of 
mechanosensory information. Different LTM cell types are known to 
innervate discrete laminar territories in the dorsal horn (Li et al., 
2011; Schouenborg, 2008), raising the possibility that different 
forms of touch, e.g., caress or pressure, activate different 
combinations of RORa IN subtypes. For instance, PKCγ+/RORa+ INs 
in lamina IIi are likely to be innervated by Aδ afferents (Li et al., 
2011; Light and Perl, 1979a, b; Todd, 2010), while RORa+/MafA+/c-Maf+ 
neurons in lamina III would be activated by 
Aβ-LTM inputs. Sensory coding by the RORa INs may therefore 
play an important role in discriminating different forms of touch. 

One question that arises from this study is whether innocuous 
touch is mediated solely by the RORa INs or whether other excit- 
atory IN cell types in the dorsal horn also contribute to the sense 
of light touch. The deficits in sensing light touch in the RORa 
IN-ablated mice are very pronounced in the glabrous skin, indicating 
that light touch information from the glabrous skin is primarily 
gated via RORa INs. However, it is likely that other excitatory 
INs contribute to innocuous touch transmission from hairy skin, 
as RORa INs are not directly innervated by TH+ C-LTMs 
(Figure 3). RORa INs also lack inputs from Pacinian corpuscles 
(Figure S3), with the MafA+/c-Maf+ neurons in lamina IV that lack 
RORa expression being the most likely candidates for 
processing information from Pacinian LTMs. 

RORa INs: A Nexus for Integrating Cutaneous Touch and 
Descending Motor Control Signals 

Human studies have shown that the cutaneous sensory, visual, 
and vestibular systems all contribute to fine motor control and 
balance (Perry et al., 2000; Stal et al., 2003). It has also been 
shown in cats, that the corticospinal pathway displays functional 
convergence on cutaneous reflex pathways (Bretzner and Drew, 
2005; Lundberg and Voorhoeve, 1962). What was not clear was 
how information from these different pathways is merged to pro- 
vide a coherent set of commands to the spinal motor system. 
Our discovery that RORa INs are innervated by descending motor 
pathways from the cortex and cerebellum suggests that much 
of this integration may occur at the level of the dorsal horn with 
the RORa INs playing a prominent role in integrating cutaneous 
sensory information with descending motor commands. We 
propose that the RORa INs, via their direct excitatory inputs to 
premotor neurons and motor neurons, function as the core 
integrative element of a sensorimotor circuit for corrective motor 
behaviors and fine motor control (Figure 7I). 

Mechanosensory feedback from the sole of the foot makes a 
major contribution to postural stability (Perry et al., 2000; Stal 
et al., 2003), with these pathways often being compromised in 
elderly and Parkinson’s patients that are prone to falling (Patel 
et al., 2009; Pratorius et al., 2003; Zia et al., 2003). With respect 
to the motor phenotype in mice lacking RORa INs, the loss of 
sensory feedback from Meissner corpuscles in the sole of the 
foot that are used to sense slippage would contribute to the in- 
crease in foot slips, and possible issues with balance, during 
beam walking. Deficits in detecting skin deformation and edges 
due to impaired signaling from Merkel cells and Ruffini endings 
may also be a factor. 

The RORa INs are innervated by LVST neurons, which points 
to an important role for the RORa INs in gating the output of 



LVN to the spinal cord. LVST neurons functionally facilitate 
limb extension (Grillner et al., 1971; Hultborn et al., 1976), in 
part via their excitatory actions on Ia reciprocal inhibitory 
neurons. Our observation that a subpopulation of V2b INs possess 
the features of Ia reciprocal inhibitory neurons (Zhang et al., 
2014), coupled with rabies-tracing experiments showing the 
RORa INs synapse onto V2b INs (FS and MG; unpublished 
data), raises the possibility that the vestibular control of limb 
extension may be partly mediated by the excitatory actions of 
RORa INs on V2b-derived Ia reciprocal inhibitory neurons that 
inhibit flexor motor neurons. 

RORa IN-dependent sensory feedback is largely dispensable 
for gross locomotor movements. This finding concurs with studies 
in the cat showing gross locomotor behaviors such as walking are 
largely independent of cutaneous feedback (Rossignol et al., 
2006). Nonetheless, light touch can exert strong phase-depen- 
dent effects on locomotion, as exemplified by the stumbling 
corrective response and paw shake reflex (Forssberg, 1979; 
Quevedo et al., 2005; Rossignol et al., 2006), with the RORa INs being 
potential candidates for mediating these corrective reflexes. The 
limited role that light touch plays in shaping coarse stepping 
movements suggests the sensorimotor system is functionally 
organized so that light touch is primarily used for corrective motor 
movements. In this way, vestibulospinal, rubrospinal, and cortico- 
spinal pathways that intersect with cutaneous sensory pathways 
in the dorsal spinal cord would have the capacity to elicit dynamic 
changes in posture and limb/foot/digit position without disrupting 
the overall locomotor pattern. Under certain conditions, however, 
these pathways are able to alter the locomotor pattern. 

Sensorimotor circuits in the spinal cord and cerebellum are 
organized into action-based sensorimotor modules that consti- 
tute a functional scaffold for corrective and/or reflexive motor 
behaviors (Schouenborg, 2008). We propose that the RORa IN 
touch circuit represents an action-based sensorimotor circuit 
for corrective motor behaviors. Specifically, the RORa INs 
couple a tactile map of the body surface to motor pathways 
via their connections to elements of the spinal motor circuitry 
(Figure 6) and to postsynaptic dorsal column (PSDC) projection 
neurons (Figure S6) that relay tactile sensory information to the 
cerebellum (Ito, 2012). In this context, the RORa IN-PSDC-cere- 
bellar pathway would operate as an error detection system to 
correct motor movements in response to tactile sensory feed- 
back (Apps and Garwicz, 2005). In summary, the RORa IN- 
PSDC-cerebellum pathway and the reciprocal LVST-RORa IN 
pathway constitute a spinocerebellar feedback loop that utilizes 
cutaneous sensorimotor feedback to shape the fine corrective 
movements animals use for dynamic motor control. 

EXPERIMENTAL PROCEDURES 
Mouse Lines 

Multiple mouse lines were used for this study. Mice were maintained on a 
mixed background and littermates were used as controls for all experiments. 
RORa.^^^ mice were kindly provided by Dr. Dennis O’Leary (Chou et al., 2013). 
Littermates lacking the Cdx2::FlpO allele were used as controls for all 
behavioral experiments. Expression analyses were performed using the 
R26LSL-tdTomato (Madisen et al., 2010), (FRT-stop-FRT 
deleted; Duan et al., 2014), and mice (Stam et al., 2012). 

y:j 2 gds-HTB ^ 26 ‘-s>--tva (Seidler et al., 2008) were used for 
transsynaptic tracing and morphological analysis with EnvA-pseudotyped 
rabies virus. and Cdx2::FlpO mouse lines were used for all the 

behavioral analyses. 

Immunohistochemistry 

Immunohistochemical analyses were performed on cryostat sections of fixed 
tissues using previously described methods (Bourane et al., 2007; Gross et al., 
2002). 

In Situ Hybridization 

In situ hybridization was performed as previously described (Bourane et al., 
2007). 

c-Fos Induction 

P6 RORa^'^^\ mice, restrained in a specially designed enclosure, 
were acclimated for 10 min prior to stimulation. For light brush stimulation, a 
soft paint brush was used to gently stroke the plantar surface (glabrous) or 
dorsal surface (hairy) of the hindpaw for 45 min at approximately 0.5 Hz. To 
activate pain pathways, 6 μl of capsaicin (1 mg in 10 ml saline, 7% Tween-80) was 
injected subcutaneously into the plantar surface of the hindpaw. A 60 min 
chase time was included between stimulation and sacrificing the animals. 
Spinal cords were immediately dissected and fixed in 4% paraformaldehyde/ 
PBS. Frozen sections from the lumbar spinal cord were immunostained with 
antibodies specific for c-Fos and β-galactosidase. 
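
For orientation, the stimulation parameters above imply a capsaicin dose per injection and an approximate number of brush strokes. The sketch below works through that arithmetic. It assumes that the stated 10 ml is the final solution volume and that the stroke rate was exactly 0.5 Hz, neither of which is stated precisely in the protocol; all variable names are introduced here for illustration.

```python
# Capsaicin dose per injection (assumes 10 ml is the final solution volume).
CAPSAICIN_MG = 1.0      # mass of capsaicin dissolved
SOLUTION_ML = 10.0      # volume of saline/Tween-80 vehicle
INJECTION_UL = 6.0      # volume injected into the hindpaw

concentration_ug_per_ul = (CAPSAICIN_MG * 1000.0) / (SOLUTION_ML * 1000.0)
dose_ug = concentration_ug_per_ul * INJECTION_UL

# Approximate number of brush strokes (assumes a constant 0.5 Hz rate).
BRUSH_MINUTES = 45
BRUSH_HZ = 0.5
strokes = BRUSH_MINUTES * 60 * BRUSH_HZ

print(f"Capsaicin concentration: {concentration_ug_per_ul:.2f} ug/ul")
print(f"Capsaicin dose per injection: {dose_ug:.2f} ug")
print(f"Approximate brush strokes: {strokes:.0f}")
```

Under these assumptions each injection delivers about 0.6 μg of capsaicin, and the brushing protocol amounts to roughly 1,350 strokes.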

Rabies Virus Tracing 

Injections of pseudotyped rabies virus into the lumbar spinal cord of P14 
RORa^^^; mice were used to examine RORa IN morphology. For 
transsynaptic tracing studies, injections were made into the lumbar cord of 
RORa^^^; Cdx2::FlpO; animals at P5 or P10. For muscle injections, 
single muscles were injected with G-protein-deleted-mCherry rabies virus 
according to Stepien et al. (2010). 

Behavioral Testing and Analysis 

All the behavioral tests were performed blind to the genotype of the animals. 
Animal experiments were conducted according to NIH guidelines using 
protocols approved by the Salk Institute for Biological Studies IACUC. Detailed 
protocols for all behavioral tests are described in Extended Experimental 
Procedures. 

Electrophysiology 

Whole-cell recordings and dorsal root stimulation were performed using 
sagittal hemicords prepared from postnatal P5-P21 RORa^^^; 
mice as described by Torsney and MacDermott (2006), with minor 
modifications. The CST was stimulated with a concentric bipolar electrode (FHC) 
positioned 3-4 mm rostral to the recorded neuron. Stimulation was performed at 
2 Hz (100 μA, 0.1 ms) to exclude retrograde axonal transmission. See Extended 
Experimental Procedures. 

Quantitative Analysis and Statistics 

Cell counts were determined by analyzing 3-6 spinal cords (5-10 sections 
each) per genotype. All data are presented as the mean ± SEM, with n 
indicating the number of mice analyzed. Statistical analyses were performed by 
two-tailed, unpaired Student’s t test. p values below 0.05 were considered 
to be statistically significant. 
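
As a rough illustration of the summary statistics named above, the sketch below computes a mean ± SEM and an unpaired, equal-variance Student's t statistic using only the Python standard library. The data values are hypothetical and are not taken from this study, and the function names `mean_sem` and `student_t` are introduced here for illustration.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, SEM), where SEM = sample SD / sqrt(n)."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a, b):
    """Unpaired, equal-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, na + nb - 2


# Hypothetical measurements, for illustration only (not data from this study).
control = [1.0, 2.0, 3.0, 4.0]
ablated = [3.0, 4.0, 5.0, 6.0]

control_mean, control_sem = mean_sem(control)
t_stat, dof = student_t(control, ablated)

print(f"control: {control_mean:.2f} +/- {control_sem:.2f} (mean +/- SEM)")
print(f"t = {t_stat:.3f}, df = {dof}")
```

The two-tailed p value is then read from the t distribution with the returned degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_ind`, which assumes equal variances by default) performs both steps in one call.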

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures and 
six figures and can be found with this article online at 
http://dx.doi.org/10.1016/j.cell.2015.01.011. 
SUMMARY 

Optimally orchestrating complex behavioral states, 
such as the pursuit and consumption of food, is crit- 
ical for an organism’s survival. The lateral hypothal- 
amus (LH) is a neuroanatomical region essential for 
appetitive and consummatory behaviors, but 
whether individual neurons within the LH differen- 
tially contribute to these interconnected processes 
is unknown. Here, we show that selective optoge- 
netic stimulation of a molecularly defined subset 
of LH GABAergic (Vgat-expressing) neurons enhances 
both appetitive and consummatory behaviors, 
whereas genetic ablation of these neurons 
reduced these phenotypes. Furthermore, this tar- 
geted LH subpopulation is distinct from cells con- 
taining the feeding-related neuropeptides, melanin- 
concentrating hormone (MCH), and orexin (Orx). 
Employing in vivo calcium imaging in freely 
behaving mice to record activity dynamics from 
hundreds of cells, we identified individual LH 
GABAergic neurons that preferentially encode as- 
pects of either appetitive or consummatory behav- 
iors, but rarely both. These tightly regulated, yet 
highly intertwined, behavioral processes are thus 
dissociable at the cellular level. 



INTRODUCTION 

Complex, but evolutionarily well-conserved, behavioral states, 
such as those related to feeding, require both precurrent and 
consummatory responses (Sherrington, 1906). These oftentimes 
sequential reactions, which are historically conceptualized as 
appetitive and consummatory behaviors (Ball and Balthazart, 
2008; Craig, 1917; Lorenz, 1950; Tinbergen, 1951), represent 
highly intertwined processes that are not fully distinguished at 
the neural level. The lateral hypothalamus (LH), a critical 
modulator of both appetitive and consummatory processes, is a het- 
erogeneous brain area containing numerous genetically distinct 
cell populations that utilize various signaling modalities (Ber- 
thoud and Munzberg, 2011). Gene expression patterns within 
the LH suggest that individual neurons likely release either 
inhibitory or excitatory neurotransmitters, γ-aminobutyric acid 
(GABA), and glutamate, as well as a host of neuropeptides 
(Hokfelt et al., 2000; Lein et al., 2007), implying that identifiable 
subdivisions within the global LH neuronal network can be 
genetically targeted. Electrical stimulation of the LH, which 
nonspecifically activates various cell types and processes, 
evokes voracious consummatory feeding, as well as appetitive 
reward-related behaviors (Hoebel and Teitelbaum, 1962; Mar- 
gules and Olds, 1962; Olds and Milner, 1954; Wise, 1971), 
whereas ablation of the region results in emaciation and aphagia 
(Anand and Brobeck, 1951; Hoebel, 1965). Moreover, the activity 
of LH neurons changes in response to food and associated stim- 
uli (Ono et al., 1981, 1986). These electrically evoked behaviors 
and feeding-specific activity patterns were discovered in 
numerous species of the Kingdom Animalia, from lizards (Mo- 
lina-Borja and Gomez-Soutullo, 1989) to humans (Quaade 
et al., 1974). These past findings emphasize the importance 
and evolutionary conservation of the LH for regulating these sur- 
vival-oriented behaviors. However, given the heterogeneous 
cellular composition of the LH (Adamantidis and de Lecea, 
2009; Karnani et al., 2013; Knight et al., 2012), and the fact that 
multiple fibers of passage traverse this region (Hahn and Swan- 
son, 2012), classical electrical stimulation, lesion, or electro- 
physiological recording studies are unable to determine whether 
genetically defined cell types, such as LH GABAergic neurons, 
encode and regulate precise aspects of appetitive food-seeking 
and/or consummatory behaviors. 

Neural circuit tracing and manipulation experiments revealed 
that optogenetic modulation of LH glutamatergic neurons influ- 
ences feeding and motivated behavioral responding (Jennings 
et al., 2013a). In the current study, we examined if molecularly 
defined LH neurons that express the gene for the vesicular 
GABA transporter (Vgat), and thus synthesize and release 
GABA, selectively promote and encode appetitive and consum- 
matory behaviors. 

RESULTS 

Optogenetic Stimulation and Inhibition of LH GABAergic 
Neurons Bidirectionally Modulate Feeding and Reward- 
Seeking Behavior 

Given that electrical stimulation of the LH produces multiple 
behavioral responses, including feeding and positive reinforce- 
ment (Hoebel and Teitelbaum, 1962; Margules and Olds, 
1962), and that the highest concentration of GABA in the hypo- 
thalamus is located within the anterior portions of the LH (Kimura 
and Kuriyama, 1975), we first explored the functional role of LH 
GABAergic neurons in modulating feeding and reward-related 
behaviors by using optogenetic techniques. Applying estab- 
lished viral procedures (Jennings et al., 2013b), we first ex- 
pressed channelrhodopsin-2 conjugated to enhanced yellow 
fluorescent protein (AAV5-EF1a-DIO-hChR2(H134R)-eYFP) 
selectively in GABAergic neurons located within the anterior 
and dorsolateral subdivisions of the LH (Figures 1A and 1B) in 
Vgat-IRES-Cre mice (Vong et al., 2011) and implanted optical 
fibers directly above the LH for somata photostimulation (Fig- 
ure 1C; Figure S1A available online). Approximately 4 weeks 
after surgery, we tested whether direct photostimulation of LH 
GABAergic neurons influenced feeding and reward-related 
behavioral phenotypes in ad-libitum-fed mice. Photoactivation 
of these neurons at 20 Hz significantly increased the time spent 
in a designated food area (Figures 1D and 1E), food consumption 
(Figure 1F), time spent in a location paired with photostimulation 
(Figure 1G), and optical self-stimulation behavior (Figure 1H). 
Next, we tested whether photoinhibition of LH GABAergic 
neurons disrupted feeding and reward-related behaviors in 
food-restricted mice. Utilizing similar procedures as described 
above, we targeted a modified variant (Mattis et al., 2012) 
of the inhibitory opsin, archaerhodopsin-3 (AAV5-EF1a-DIO- 
eArch3.0-eYFP; Chow et al., 2010), to LH GABAergic neurons 
in Vgat-IRES-Cre mice (Figures 1I-1K; Figure S1B). Photoinhibi- 
tion of LH GABAergic neurons led to significant reductions in 
time spent in a designated food area (Figures 1L and 1M), food 
consumption (Figure 1N), and time spent in a location paired 
with photoinhibition (Figure 1O). These data indicate that selec- 
tive optogenetic modulation of neurochemically distinct LH 
GABAergic neurons (Figures S1C-S1Q) influences both feeding 
and reward-related phenotypes, supporting the idea that both 
appetitive and consummatory behavioral processes are repre- 
sented in the LH by GABAergic neurons. 

Chemogenetic Activation of LH GABAergic Neurons 
Enhances Consummatory Behaviors 

To expand upon the acute behavioral effects observed from op- 
togenetic manipulations (Figure 1), we investigated if sustained 
activation of LH GABAergic neurons, over a longer timescale, 
influenced consumption and work effort to earn a caloric reward. 
Thus, we virally targeted the Gq-coupled excitatory designer re- 
ceptor exclusively activated by designer drugs (DREADD), 
hM3Dq, to LH GABAergic neurons by injecting the Cre-inducible 
viral construct, AAV8-hSyn-DIO-hM3D(Gq)-mCherry (Krashes 



et al., 2011), into similar target zones within the LH of Vgat- 
IRES-Cre mice (LH^GABA::hM3Dq; Figures 2A and 2B; Figures 
S2A-S2G). The inert molecule, clozapine-N-oxide (CNO), selec- 
tively binds to hM3Dq and activates neurons through Gq 
signaling pathways (Alexander et al., 2009). Therefore, to verify 
CNO-mediated activation in LH^GABA::hM3Dq neurons, we per- 
formed whole-cell recordings in brain slices and found that the 
spontaneous firing rate of a subset of LH^GABA::hM3Dq neurons 
significantly increased upon CNO (5 μM) bath application (Fig- 
ures 2C and 2D). Additionally, we examined whether in vivo 
stimulation of LH GABAergic neurons, via systemic CNO admin- 
istration, enhanced the expression of Fos, a marker for neuronal 
activity, within the LH. Intraperitoneal (i.p.) injections of CNO 
(1 mg/kg) in LH^GABA::hM3Dq mice significantly increased Fos 
expression in the LH compared to LH^GABA::Control mice injected 
with CNO (Figures 2E-2I). To determine whether DREADD- 
mediated activation of LH GABAergic neurons affected consum- 
matory behavior, mice were trained in a free-access caloric 
consumption task (1-hr duration) that permitted the quantifica- 
tion of lick responses at the delivery spout for a palatable caloric 
liquid reward. Chemogenetic activation of LH GABAergic neu- 
rons in LH^GABA::hM3Dq mice via CNO (1 mg/kg, i.p.) injection 
led to a significant increase in the number of lick (consummatory) 
responses when compared to controls (Figures 2J and 2K). In 
addition, CNO administration 45 min prior to a 1 hr free-access 
feeding task (Figure 2L) enhanced food intake in ad-libitum-fed 
LH^GABA::hM3Dq mice (Figure 2M; see the Extended Experi- 
mental Procedures). 

To examine whether chemogenetic activation of LH 
GABAergic neurons alters work effort exerted to earn a caloric 
reward, we initially trained mice on a fixed ratio of one (FR1) 
schedule, where each active nose poke resulted in the delivery 
of a palatable and calorie-dense liquid. Following stable behav- 
ioral responding, on subsequent sessions, LH^GABA::hM3Dq and 
LH^GABA::Control mice were either administered saline or CNO 
(counterbalanced) 45 min prior to progressive ratio 3 (PR3) test 
sessions, which is an established behavioral assay for measuring 
an animal’s motivation to obtain caloric rewards (Krashes et al., 
2011). Intriguingly, activation of these cells significantly 
increased lick responses (metrics of consumption; Figures 2N 
and 2O), but did not alter the number of active nose pokes 
nor the break point (metrics of motivation; Figures 2P and 2Q). 
Anecdotally, we observed that LH^GABA::hM3Dq mice treated 
with CNO spent the majority of the time licking at the reward 
receptacle (despite reward delivery being contingent upon 
nose poke responding), suggesting that bulk chemogenetic acti- 
vation of these neurons may override optimal behavioral perfor- 
mance by biasing behavior toward aspects of consumption. 
Taken together, these data indicate that chemogenetic activa- 
tion of LH GABAergic neurons enhances consummatory 
behaviors. 

Vgat-Targeted LH GABAergic Neurons Are Molecularly 
Distinct from MCH and Orx Cells 

The LH exclusively houses two separate molecularly defined cell 
types, melanin-concentrating hormone (MCH) and orexin (Orx) 
neurons (Elias et al., 1998), which are known to have diverse 
roles in regulating food intake, energy balance, reward, and 

Figure 1. Optogenetic Modulation of LH 
GABAergic Neurons Bidirectionally Modu- 
lates Feeding and Reward-Related Behav- 
iors 

(A) Scheme for viral targeting of AAV5-EF1a-DIO- 
ChR2-eYFP to the LH of Vgat-IRES-Cre animals. 

(B) 20x confocal image depicting ChR2-eYFP 
expression in LH GABAergic neurons. EP, en- 
topeduncular nucleus; LH, lateral hypothalamus; 
Fx, fornix; DMH, dorsomedial hypothalamic nu- 
cleus; VMH, ventromedial hypothalamic nucleus; 
Arc, arcuate nucleus; 3v, third ventricle; D, dorsal; 
L, lateral; M, medial; V, ventral. Scale bar, 200 μm. 

(C) Diagram for photostimulation of LH GABAergic 
neurons. 

(D) Color map encoding spatial location from an 
ad-lib-fed LH^GABA::ChR2 mouse during the free- 
access feeding paradigm. 

(E) Photostimulation of LH GABAergic neurons 
significantly increased time spent in the food zone 
compared to controls and time epochs without 
photostimulation (n = 5 mice per group; F2,27 = 
86.24; p < 0.0001). 

(F) 20 Hz photostimulation significantly increased 
food intake in LH^GABA::ChR2 mice compared to 
controls and time epochs without photostimulation 
(n = 5 mice per group; F2,27 = 17.05; p < 0.0001). 

(G) LH^GABA::ChR2 mice spent significantly more 
time in the chamber paired with photostimulation 
compared to controls (n = 5 mice per group; t8 = 
6.796; p < 0.0001). 

(H) LH^GABA::ChR2 mice nose poked significantly 
more for 20 Hz photostimulation compared to 
controls (n = 4 mice per group; t6 = 5.744; p = 
0.0012). 

(I) Viral targeting of AAV5-EF1a-DIO-eArch3.0- 
eYFP into the LH of Vgat-IRES-Cre mice. 

(J) 20x confocal image showing eArch3.0-eYFP 
expression in the LH of a Vgat-IRES-Cre mouse. 
Scale bar, 200 μm. 

(K) Illustration for somata LH GABAergic photo- 
inhibition. 

(L) Color map encoding spatial location of an 
example food-deprived LH^GABA::eArch3.0 animal 
during the free-access feeding paradigm. 

(M) Upon photoinhibition exposure, LH^GABA:: 
eArch3.0 animals spent significantly less time in 
the food zone compared to controls (n = 5 mice per 
group; F1,16 = 9.39; p = 0.007). 

(N) Photoinhibition of LH GABAergic neurons 
significantly suppressed food intake in food- 
restricted mice when compared to controls and 
time epochs without photoinhibition (n = 5 mice per 
group; F1,16 = 5.43; p = 0.033). 

(O) LH^GABA::eArch3.0 mice spent significantly less time in the photoinhibition-paired chamber compared to controls (n = 5 mice per group; t8 = 4.512; p = 0.002). 
All values are mean ± SEM. Student’s t test or two-way ANOVA; *p < 0.05, **p < 0.01, ***p < 0.001. See also Figure S1. 

sleep-wakefulness (Jego et al., 2013; Karnani et al., 2011; Sa- 
kurai et al., 1998; Whiddon and Palmiter, 2013). Because MCH 
and Orx are thought to promote aspects of feeding and 
reward-related behaviors, we next assessed whether Vgat-tar- 
geted neurons within the LH coexpress either of these orexigenic 
neuropeptides. To achieve this, we immunolabeled both pep- 
tide-producing neuronal populations in Vgat-IRES-Cre mice 
that endogenously express eYFP in Vgat+ cells (Vgat-eYFP) 



and determined whether Orx and/or MCH neurons colocalize 
with Vgat+ cells in the LH. Strikingly, immunostaining for MCH 
and Orx in Vgat-eYFP (Vgat-IRES-Cre mice crossed to an Ai3 re- 
porter line; Madisen et al., 2010) brain slices revealed that Vgat+ 
LH neurons do not coexpress either of these neuropeptides, 
signifying that these Vgat-targeted neurons in the LH represent 
a neurochemically distinct GABAergic subpopulation that is 
separate from MCH and Orx cells (Figures 3A-3F; Figure S3). 

Figure 2. Bulk Chemogenetic Activation of LH GABAergic Neurons Enhances Consummatory Behaviors 

(A) Viral targeting of AAV8-hSyn-DIO-hM3D(Gq)-mCherry into the LH of Vgat-IRES-Cre mice. 

(B) 20x confocal image showing hM3Dq-mCherry expression in the LH of a Vgat-IRES-Cre animal. Scale bar, 200 μm. 

(C) Example current-clamp traces from a LH^GABA::hM3Dq brain slice before (baseline) and after 5 μM CNO demonstrating DREADD-mediated action potentials 
(bottom). 

(D) CNO significantly increases spontaneous firing of LH^GABA::hM3Dq neurons (n = 3 cells; n = 3 mice; t4 = 12.370; p = 0.0002). 

(E) 20x confocal image from an example LH^GABA::hM3Dq animal, where Fos-induction was assessed 2 hr after CNO (1 mg/kg; i.p.) injection. Scale bar, 500 μm. 
(F-H) 63x confocal images displaying Fos immunoreactivity (F) and hM3Dq-mCherry expression (G) in a LH^GABA::hM3Dq animal injected with CNO. (H) Merged 
image of (F) and (G) showing colocalization of Fos and hM3Dq-mCherry expression as indicated by white arrows. Scale bars, 20 μm. 

(I) CNO administration in LH^GABA::hM3Dq mice significantly increases Fos expression in the LH compared to controls and neighboring regions (n = 3 LH sections; 
n = 3 mice per group; Fi e = 46.00; p < 0.0001). 

(J) Cumulative lick responses from individual LH^GABA::hM3Dq and LH^GABA::Control mice during a single 1 hr free-access caloric consumption task. 

(K) Systemic CNO application significantly increases lick responses in LH^GABA::hM3Dq mice compared to LH^GABA::Control mice and saline injections during a 
free-access caloric consumption task (n = 6 mice per group; F1,20 = 8.12; p = 0.01). 

(L) Example color maps from LH^GABA::hM3Dq (top) and LH^GABA::Control mice (bottom) during the free-access feeding task. 

(M) Systemic application of CNO in LH^GABA::hM3Dq mice significantly increased food consumption during the free-access feeding task when compared to 
controls and saline injections (n = 6 mice per group; F1,20 = 6.37; p = 0.02). 

(N) Cumulative lick responses from example LH^GABA::hM3Dq and LH^GABA::Control animals during a single 1 hr progressive ratio 3 (PR3) session. 

(legend continued on next page) 

Although these neuropeptide-producing neurons have been 
implicated as crucial modulators of feeding, selective activation 
of Vgat-targeted neurons within the LH does not directly stimu- 
late MCH and Orx cells to produce feeding and reward-related 
behaviors (Figures S1F-S1Q and S2B-S2G). 

Genetic Ablation of LH GABAergic Neurons Reduces 
Consummatory Behaviors and Motivation to Obtain 
Caloric Rewards 

Next, we assessed the necessity of LH GABAergic neurons for 
regulating these feeding-related processes with cell-type-spe- 
cific ablation methods. To selectively ablate LH GABAergic neu- 
rons, we injected a Cre-inducible viral construct coding for 
taCasp3-2A-TEVp (AAV2-FLEX-taCasp3-TEVp) into the LH of 
Vgat-IRES-Cre mice (LH^GABA::taCasp3; Figure 3G) (Yang et al., 
2013). In situ hybridization for glutamic acid decarboxylase 
(GAD67), a marker for GABAergic neurons (Erlander et al., 
1991), revealed a significant reduction in the number of LH 
GABAergic neurons following bilateral LH injections of 
AAV-FLEX-taCasp3-TEVp, and no change in the number of 
GABAergic neurons within neighboring regions, including the 
entopeduncular nucleus (EP) and ventromedial hypothalamus 
(VMH; Figures 3H-3J). Previous studies demonstrated that 
chemical or electrolytic lesions of the LH can result in significant 
aphagia and weight loss (Harrell et al., 1975; Schallert and 
Whishaw, 1978). Therefore, we monitored the daily body weight 
of LH^GABA::taCasp3 and LH^GABA::Control mice, while both 
groups were maintained on a calorie-dense diet for 60 days. 
Ablation of LH GABAergic neurons significantly blunted weight 
gain in LH^GABA::taCasp3 mice compared to controls (Figure 3K). 
Additionally, LH^GABA::taCasp3 mice displayed a significant 
reduction in daily food intake measured at 1 month postvirus in- 
jection (Figure 3L). 

Next, we tested the effects of LH GABAergic neuron ablation 
on acute feeding and motivation to obtain caloric reward. 
Food-restricted LH^GABA::taCasp3 mice showed reduced food 
intake in the acute free-access feeding assay (Figures 3M and 
3N) and diminished lick responding in the free-access caloric 
consumption task compared to controls (Figure 3O). When 
tested on the PR3 task, LH^GABA::taCasp3 mice exhibited 
reduced lick responding compared to controls (Figures 3P and 
3Q). Congruently, LH^GABA::taCasp3 mice showed a significant 
reduction in metrics of appetitive responding (active nose pokes 
and break points) compared to controls (Figures 3R-3T). 
Furthermore, genetic ablation of LH GABAergic neurons did 
not significantly alter the number of MCH or Orx cells in the LH 
(Figures S4A-S4F), corroborating our findings that Vgat-targeted 
LH neurons are a separate population from MCH and Orx cells 
(Figures 3A-3F; Figure S3). Lastly, LH^GABA::taCasp3 mice did 
not display locomotor deficits or anxiety-related phenotypes 
when tested in an open-field test (Figures S4G and S4H). 
Together, these data highlight a causal and specific role for LH 



GABAergic neurons in regulating appetitive responding, con- 
sumption, and energy balance. 

In Vivo Ca^2+ Imaging in Large Populations of LH 
GABAergic Neurons 

The optogenetic, chemogenetic, and cell ablation experiments 
outlined above demonstrate that bulk modulation of LH 
GABAergic neurons causally regulates appetitive and consum- 
matory behavioral output. Though these approaches provide 
important causal information for linking cell-type-specific func- 
tion to behavior, bulk modulation of even genetically defined 
neurons fails to account for the high degree of functional diver- 
sity within a targeted cell population and does not accurately 
reflect endogenous cellular activity dynamics that underlie com- 
plex behaviors. Therefore, it is unclear whether appetitive and 
consummatory processes are orchestrated by functionally 
discrete LH GABAergic subpopulations or whether individual 
neurons participate in aspects of both appetitive and 
consummatory behaviors. To examine this in detail, we applied 
in vivo microendoscopic imaging strategies (Barretto et al., 
2011; Flusberg et al., 2008; Ghosh et al., 2011; Jennings and 
Stuber, 2014) to resolve somatic Ca^2+ activity dynamics from 
hundreds of LH GABAergic neurons (n = 743 cells) in freely 
behaving mice (n = 6 mice; Movie S1). First, we expressed the 
green fluorescent sensor of neuronal activity, GCaMP6m, in LH 
GABAergic neurons by injecting a Cre-inducible AAV viral 
construct (AAVDJ-EF1a-DIO-GCaMP6m) into the LH of Vgat- 
IRES-Cre mice (LH^GABA::GCaMP6m; Figures 4A-4C; Figures 
S5A-S5L). To circumvent the optical aberrations and the 
turbidity of tissue that typically preclude imaging in deep brain 
structures like the LH (~5 mm deep), we implanted 8-mm-long 
microendoscopes (0.5 mm diameter; consisting of a relay lens 
fused to doublet gradient refractive index microlenses) directly 
above the LH for optical detection of GCaMP6m fluorescence 
emission (Figures 4D-4F; Figure S5M). Next, we interfaced the 
implanted microendoscope with a detachable miniaturized fluo- 
rescence microscope (Ghosh et al., 2011; Ziv et al., 2013) to 
visualize Ca^2+ signals from large populations of LH GABAergic 
neurons in freely moving LH^GABA::GCaMP6m mice (Figure 4G). 
Furthermore, applying this technique in conjunction with estab- 
lished computational algorithms (Mukamel et al., 2009), we 
were able to identify and track the Ca^2+ activity from individual 
LH GABAergic neurons within single recording sessions and 
across multiple days and behavioral tasks (Figures 4H and 4I; 
see the Extended Experimental Procedures). 

Food-Location Coding Profiles of LH GABAergic 
Neurons 

Because the activity of unclassified LH neurons can be modu- 
lated in response to nutrient-related information (Ono et al., 
1981, 1986), we first determined whether LH GABAergic ensem- 
bles are engaged when food is readily available. Thus, we 



(O) Systemic CNO application significantly increased lick responses in LH^GABA::hM3Dq mice during the PR3 paradigm when compared to controls and saline 
injections (n = 6 mice per group; F1,20 = 24.37; p < 0.0001). 

(P) Nose poke responses from example LH^GABA::hM3Dq and LH^GABA::Control mice during a single PR3 session. 

(Q) CNO administration did not significantly affect nose poke responses during the PR3 session (n = 6 mice per group; F1,20 = 1.47; p = 0.24). 

All values are mean ± SEM. Student’s t test or two-way ANOVA. See also Figure S2. 

Figure 3. Genetic Ablation of LH Vgat Neu- 
rons that Are Separate from MCH and 
Orx Cells Attenuates Weight Gain, Food 
Seeking, and Consummatory Behaviors 

(A-C) 20x confocal images of the dorsolateral LH 
demonstrating the absence of colocalization be- 
tween Vgat-eYFP and MCH immunostaining (n = 
223 ± 13.77 Vgat-eYFP cells per mm^2, n = 76 ± 
11.37 MCH cells per mm^2, and 0% overlap; n = 3 
LH sections; n = 3 Vgat-eYFP mice). Scale bars, 
100 μm. 

(D-F) Representative 20x confocal images from a 
Vgat-eYFP brain slice (D) immunostained for Orx in 
red (E) displaying a lack of eYFP and Orx-immu- 
noreactivity coexpression in Vgat+ LH cells (F; n = 
266 ± 24.2 Vgat-eYFP cells per mm^2, n = 200 ± 
4.41 Orx cells per mm^2, and 0% overlap; n = 3 LH 
sections; n = 3 Vgat-eYFP mice). Scale bars, 
100 μm. 

(G) Viral injection of AAV2-FLEX-taCasp3-TEVp 
into the LH of Vgat-IRES-Cre mice. 

(H and I) 20x confocal images demonstrating 
decreased GAD67 expression in LH^GABA::taCasp3 
mice (H) compared to LH^GABA::Control animals (I). 
Scale bars, 200 μm. 

(J) GAD67 expression is significantly decreased in 
the LH of LH^GABA::taCasp3 animals compared to 
LH^GABA::Controls. Ablation of LH GABAergic 
neurons does not significantly alter GAD67- 
expression levels within the VMH and EP of 
LH^GABA::taCasp3 and LH^GABA::Control mice (n = 3 
LH sections; n = 3 mice per group; F2,is = 5.58; 
p = 0.01). 

(K) Ablation of LH GABAergic neurons significantly 
blunted weight gain induced from a calorie-dense 
diet (n = 7 mice per group; F1720 = 377.01; p = 
0.0174). 

(L) Four weeks after taCasp3-TEVp viral injection, 
LH GABAergic cell death significantly reduced 
daily consumption of a calorie-dense diet (n = 7 
mice per group; t12 = 2.597; p = 0.0234). 

(M) Color map locations from example LH^GABA:: 
taCasp3 (top) and LH^GABA::Control (bottom) mice 
during the free-access feeding paradigm. 

(N) LH^GABA::taCasp3 mice display significant de- 
creases in food consumption when compared to 
controls during the free-access feeding paradigm 
(n = 7 mice per group; t12 = 3.239; p = 0.0071). 

(O) LH GABAergic neuron ablation significantly 
decreases lick responses in LH^GABA::taCasp3 
animals compared to controls during a free- 
access caloric consumption paradigm (n = 7 mice 
per group; t12 = 5.3320; p = 0.0002). 

(P) Lick responses from example LH^GABA::taCasp3 
and LH^GABA::Control animals during a single (1 hr) 
PR3 session. 

(Q) Ablation of LH GABAergic neurons significantly decreases lick responses in LH^GABA::taCasp3 animals compared to LH^GABA::Controls during the PR3 task 
(n = 6 mice per group; t10 = 3.024; p = 0.012). 

(R) Nose poke responses from example LH^GABA::taCasp3 and LH^GABA::Control animals during a single PR3 session. 

(S) Ablation of LH GABAergic neurons significantly decreases nose poke responding in LH^GABA::taCasp3 animals compared to LH^GABA::Controls during the PR3 
task (n = 6 mice per group; t10 = 2.773; p = 0.019). 

(T) LH^GABA::taCasp3 mice display significantly lower break points when compared to LH^GABA::Controls during the PR3 session (n = 6 mice per group; t10 = 2.692; 
p = 0.022). 

All values are mean ± SEM. Student’s t test or two-way ANOVA. See also Figures S3 and S4. 

Figure 4. In Vivo Ca^2+ Imaging of LH 
GABAergic Neurons in Freely Moving Mice 

(A) Diagram showing unilateral viral injection of 
AAVDJ-EF1a-DIO-GCaMP6m into the LH of Vgat- 
IRES-Cre mice. 

(B) 10x confocal image of GCaMP6m expression 
in LH GABAergic neurons. Scale bar, 500 μm. 

(C) 63x confocal image demonstrating stable and 
healthy GCaMP6m expression in LH GABAergic 
neurons several months after viral delivery. Scale 
bar, 20 μm. 

(D) Integration of the miniepifluorescence micro- 








characterized the response profiles of individual LH GABAergic 
neurons during the same free-access feeding task used 
above and recorded Ca^2+ signals as food-restricted 
LH^GABA::GCaMP6m mice freely explored an arena that possessed 
discrete food-containing zones (FZ) in two of the outer corners 
(Figure 5A; Movie S2). This approach allowed for a visual repre- 
sentation of discrete spatial Ca^2+ responses (Figures 5B and 5C) 
with some neurons exhibiting increased Ca^2+ spiking in the pres- 
ence of food (Figures 5B and 5D) and others showing decreased 
Ca^2+ transients while the animal was in the FZ (Figures 5C and 
5E). Further, we calculated a response ratio for each LH 
GABAergic neuron based on the frequency of Ca^2+ responses 
in the FZ over the frequency in nonfood zones (FZ to NFZ; Fig- 
ure 5F). We categorized response profiles from individual neu- 
rons as food zone excited (FZe) or food zone inhibited (FZi) if their 
log FZ/NFZ ratios were more than 1 SD above or below, respectively, 
the mean (0.0 ± 0.3) of the entire population (n = 612 cells). Total 
Ca^2+ activity from both groups decreased over time (Figure S6); 
however, average responses to FZ and NFZ areas within FZe and 
FZi neurons revealed significant differences in their respective 
directions (Figures 5G and 5H), supporting the design of our 
classification model. To spatially represent the response profiles 
of each neuron, we pseudocolored individual cells from an example 
animal’s cell map by its log-transformed response ratio and 
observed that cells of differing response profiles intermingled 
with each other rather than segregating into separate clusters 
(Figure 5I). These data demonstrate 
that the endogenous activity dynamics of subsets of LH 
GABAergic neurons are preferentially modulated in environ- 
mental locations paired with food. 
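
The food-zone classification described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name, the base-10 logarithm, the use of the population SD, and the requirement that every cell has a nonzero event rate in both zones are assumptions made only for illustration.

```python
import math
import statistics


def classify_food_zone_cells(fz_rates, nfz_rates, n_sd=1.0):
    """Label each cell 'FZe', 'FZi', or 'none' from its log FZ/NFZ ratio.

    fz_rates / nfz_rates: Ca2+ event frequencies per cell inside and
    outside the food zone (hypothetical inputs; all rates must be > 0).
    """
    # Log-transformed response ratio per cell (base 10 is an assumption).
    log_ratios = [math.log10(fz / nfz) for fz, nfz in zip(fz_rates, nfz_rates)]

    # Mean and SD of the whole imaged population (population SD assumed).
    mean = statistics.mean(log_ratios)
    sd = statistics.pstdev(log_ratios)

    labels = []
    for ratio in log_ratios:
        if ratio > mean + n_sd * sd:
            labels.append("FZe")   # food zone excited
        elif ratio < mean - n_sd * sd:
            labels.append("FZi")   # food zone inhibited
        else:
            labels.append("none")  # within 1 SD of the population mean
    return log_ratios, labels


# Toy example: one cell fires 4x more in the FZ, one 4x less, three are flat.
ratios, labels = classify_food_zone_cells([4, 1, 1, 1, 0.25], [1, 1, 1, 1, 1])
print(labels)
```

Because the threshold is defined relative to the population distribution, the labels depend on which cells are included; the same cell could be classified differently in a different imaged population.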

Individual LH GABAergic Neurons Selectively Encode 
Aspects of Appetitive or Consummatory Behaviors 

Next, we sought to determine whether individual LH GABAergic 
neurons selectively compute aspects of appetitive and/or 
consummatory behaviors. Therefore, we tracked Ca^2+ dy- 
namics in LH GABAergic neurons during the same PR3 task 
described above, where LH^GABA::GCaMP6m mice worked 
to obtain a caloric liquid reward. Numerous individual LH 
GABAergic neurons showed time-locked Ca^2+ transients in 
response to either the first lick after reward delivery (consum- 
matory responses) or unreinforced active nose pokes (appeti- 
tive responses; Figures 6A and 6B; Movie S3). Nonconsumma- 
tory licks, those following unreinforced nose pokes, also evoked 
Ca^2+ changes but to a much lesser degree (Figure S7A), 

scope with the microendoscope for deep-brain 
imaging of LH GABAergic neurons expressing 
GCaMP6m. 

(E) 20x confocal image showing lens (micro- 
endoscope) placement and GCaMP6m-express- 
ing neurons within the LH. Focal plane in tissue is 
300 μm from the bottom of the lens as indicated by 
the red box. Scale bar, 500 μm. 

(F) In vivo miniepifluorescence image of 
GCaMP6m expression in the LH. Green arrow di- 
rects to a LH GABAergic neuron expressing 
GCaMP6m. Red arrow highlights a blood vessel. 
Scale bar, 100 μm. 

(G) Illustration of the in vivo Ca^2+ imaging setup. 

(H) Schematized cell map of an example animal’s 
LH GABAergic neurons visualized during free- 
access feeding and PR3 tasks. The same neurons 
can be tracked between sessions (colored cells). 

(I) Ca^2+ traces of individual neurons tracked in (H). 
See also Figure S5 and Movie S1. 

Figure 5. Subsets of LH GABAergic Neurons Display Enhanced or Reduced Activity to Environmental Locations Containing Food 

(A) Example trace of animal’s location during the free-access feeding task. 

(B and C) Spatial Ca^2+ activity maps of a single FZe (B) and FZi (C) cell. Behavioral arena is divided into 0.33 x 0.33 cm bins, where number of Ca^2+ events per unit 
time is represented in color. 

(D and E) Example Ca^2+ traces from one FZe (orange; D) and one FZi (blue; E) cell during the free-access feeding task in relation to animal’s location and state of 
interaction with food. 

(F) Distribution of food zone (FZ) responses from all detected cells (mean = 0.0; SD = 0.3). FZe cells are classified as above one SD from the mean. FZi cells fall 
below one SD from the mean (n = 612 total cells imaged; n = 87 FZe cells; n = 73 FZi cells; n = 6 mice). 

(G and H) Average Ca^2+ transients per min for FZe (G) and FZi (H) cells in each zone (n = 87 FZe cells; t86 = 14.92; p < 0.0001; n = 73 FZi cells; t72 = 15.08; 
p < 0.0001). 

(I) Example cell map with cells’ color encoding response to FZ. Scale bar, 100 μm. 

Error bars represent SEM. Student's t test. See also Figure S6 and Movie S2. 



signifying that these consummatory responses depend on 
the presence of the caloric reward. We profiled all imaged neu- 
rons based on their differences between the average Ca^2+ activ- 
ity 1.5 s before and after each behavioral event to quantify their 
responses to aspects of feeding. LH GABAergic neurons dis- 
played a variety of classifiable response profiles, such as cells 
that were modulated during reward consumption or immedi- 
ately following active nose pokes (Figures 6C and 6D; Figures 
S7B-S7E). Thus, we categorized these neurons as responsive 
to a particular behavioral event (consumption or nose poke) if 
their Ca^2+ activity (averaged from 0 to 1.5 s after the event) 
was statistically enhanced compared to the activity -1.5 to 
0 s before the event. Averaging responses to consummatory 
licks revealed separate subpopulations of neurons with distinct 
Ca^2+ responses (Figure 6E). A much larger population of LH 
GABAergic neurons displayed significant Ca^2+ signals time- 
locked to appetitive nose pokes (Figure 6F), although the 
amplitudes of these responses were lower compared to 
consummatory responses. LH GABAergic neurons displayed a 
diverse range of responses to consummatory licks and nose 
pokes (Figures S7F and S7G). However, we found that individ- 
ual neurons that significantly responded to reward consumption 
were largely separate from those that respond to appetitive 
nose pokes (Figures 6G and 6H). Taken together, these data 
demonstrate that functionally segregated subpopulations of 
LH GABAergic neurons encode aspects of appetitive (nose- 
poke-responsive cells) or consummatory (lick-responsive cells) 
behaviors, but rarely both. 
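
The event-locked criterion used in this section (mean Ca^2+ activity 0 to 1.5 s after an event versus -1.5 to 0 s before it) can be sketched as follows. The text does not name the statistical test, so the exact one-sided sign-flip permutation test below is an assumed stand-in, and all function names and the alpha level are hypothetical.

```python
from itertools import product
from statistics import mean


def window_mean(trace, times, start, stop):
    """Mean of the trace samples whose timestamps fall in [start, stop)."""
    return mean(v for v, t in zip(trace, times) if start <= t < stop)


def event_differences(trace, times, event_times, window=1.5):
    """Per-event difference: mean activity after minus mean activity before."""
    diffs = []
    for ev in event_times:
        before = window_mean(trace, times, ev - window, ev)
        after = window_mean(trace, times, ev, ev + window)
        diffs.append(after - before)
    return diffs


def is_excited(diffs, alpha=0.05):
    """Exact one-sided sign-flip test on the per-event differences.

    Assumed test, not taken from the source. It enumerates all 2**n sign
    patterns, so it is only practical for a small number of events.
    """
    observed = mean(diffs)
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if mean(s * d for s, d in zip(signs, diffs)) >= observed:
            as_extreme += 1
    return as_extreme / total < alpha


# Toy trace sampled every 0.5 s: activity steps up for 1.5 s after each event.
times = [i * 0.5 for i in range(40)]
events = [3.0, 6.0, 9.0, 12.0, 15.0]
trace = [1.0 if any(ev <= t < ev + 1.5 for ev in events) else 0.0 for t in times]
print(is_excited(event_differences(trace, times, events)))
```

Running the same test separately on consummatory-lick events and on nose-poke events yields, for each cell, two labels whose overlap can then be tallied as in the Venn diagram of Figure 6H.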

Lastly, we explored the Ca^2+ dynamics of the same neurons 
between both behavioral paradigms (free-access feeding and 
the PR3 task) to determine if the same neurons that respond to 
the presence of food during the free-access feeding task also 
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Figure 6. Separate LH GABAergic Neurons Selectively Encode Aspects of Appetitive or Consummatory Behaviors 

(A) response to consummatory licks. (Top) Ca^"^ response to individual consummatory iicks from an exampie ceii. (Bottom) Average Ca^"^ response to aii 
consummatory licks from the example cell. 

(B) Ca^"^ response to nose pokes. (Top) Ca^"^ response to individual nose pokes from an example cell. (Bottom) Average Ca^"^ response to all nose pokes from the 
example cell. 

(C and D) Cell activity maps from an example animal. Color codes consummatory lick or appetitive nose-poke responses (average difference between Ca²⁺
signals from 1.5 s before and after the respective event) for each cell. Scale bars, 100 µm.

(E) Average Ca²⁺ response to consummatory licks following reward delivery from all excited cells from all animals. (Top) Average Ca²⁺ response to consummatory
licks following reward delivery from all cells. (Bottom) Ca²⁺ response to consummatory licks averaged across cells excited by consummatory licks (n = 75 lick
excited; n = 743 total cells). 

(F) Average Ca²⁺ response to nose pokes from all excited cells of all animals. (Top) Average Ca²⁺ response to nose pokes from all cells. (Bottom) Ca²⁺ response to
nose pokes averaged across cells excited by nose pokes (n = 168 nose poke excited; n = 743 total cells). 

(G) Cell map from example animal. Cells excited by consummatory licks (cyan), nose pokes (yellow), or both (green). Scale bar, 100 µm.

(H) Venn diagram representing distribution and overlap of classified responsive cells in the PR3 task. 

Green-shaded regions represent SEM. See also Figure S7 and Movie S3. 



responded to consummatory licks and/or nose pokes during the
PR3 task. To accomplish this, we registered IC unit cell maps
from the same animal between sessions (Figures 7A-7C) and
applied a 5 µm cutoff between center points of paired cells
based on the distribution of cells from different imaging session
(Figure 7D) and the distance between cells from the same sessions
(Figure 7E). Using these criteria, only a subset of neurons
imaged during the free-access feeding behavioral task were
also detected in the PR3 task (n = 40 PR3 excited cells out of
125 FZ-responsive cells from paired sessions. Total paired cells
between recording sessions: 472; Figure 7F). Neurons tracked in
both sessions were not localized to any portion of the field of
view and showed no distinct anatomical pattern or layout (Figures
7G-7I). However, paired cells displayed a high degree of
functional overlap in the schematized IC unit cell maps between
distinct imaging sessions. A large proportion of FZe neurons that
also respond to PR3 behavioral events (28/40 cells) displayed
increased activity to either nose pokes or consummatory licks
in the PR3 task, whereas a smaller portion of FZi neurons (12/40
cells) responded to PR3 behavioral events (Figure 7J). Taken
together, these data reveal that subsets of LH GABAergic neurons
functionally overlap between tasks that require different
behavioral processes to obtain food, signifying the flexibility 
and complexity of these neurons for integrating and regulating 
components of feeding. 
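
The pairing of cells across imaging sessions with a 5 µm center-to-center cutoff can be illustrated with a short sketch. The text does not specify the matching algorithm, so the mutual-nearest-neighbor rule used here is an assumption, as are the function name and the toy coordinates (in µm).

```python
import math

def pair_cells(session_a, session_b, cutoff_um=5.0):
    """Pair cell centers that are mutual nearest neighbors within a cutoff.

    Hypothetical helper: each session is a non-empty list of (x, y)
    centers in micrometers, already registered to a common frame.
    """
    def nearest(p, pts):
        dists = [math.dist(p, q) for q in pts]
        j = min(range(len(pts)), key=dists.__getitem__)
        return j, dists[j]

    pairs = []
    for i, p in enumerate(session_a):
        j, d = nearest(p, session_b)
        back, _ = nearest(session_b[j], session_a)
        # Keep the pair only if each cell is the other's nearest neighbor
        # and the centers lie within the cutoff distance.
        if back == i and d <= cutoff_um:
            pairs.append((i, j))
    return pairs

# Toy cell maps from two sessions of the same animal.
free_access = [(10.0, 10.0), (40.0, 40.0), (80.0, 15.0)]
pr3 = [(11.0, 12.0), (60.0, 60.0), (79.0, 14.0)]
pairs = pair_cells(free_access, pr3)
```

In this toy example the first and third cells are paired, while the second cell has no partner within 5 µm and is treated as detected in only one session.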

DISCUSSION 

Historically, the LH has been viewed as a critical governor of both 
appetitive and consummatory behaviors (Hoebel and Teitel- 
baum, 1962; Margules and Olds, 1962). However, given that 
the LH encompasses a plethora of genetically, anatomically,

Figure 7. Tracking the Activity Dynamics of Individual LH GABAergic Neurons across Separate Behavioral Tasks

(A) Cell map from example animal during free-access feeding task. 

(B) Cell map from same example animal during PR3 appetitive task. 

(C) Merge of free-access feeding and PR3 appetitive task cell maps from same example animal. 

(D) Distribution of nearest-neighbor distances between cells of different behavioral tasks but within the same animal for all subjects. 

(E) Distribution of nearest-neighbor distances between all cells within the same behavioral task and imaging session. 

(F) Distribution of cell responses in free-access feeding task of only paired cells (n = 472 cells from 6 mice; n = 73 FZe cells; n = 52 FZi cells). 
(G-I) Maps of only paired cells from an example animal across the free-access feeding and PR3 tasks. Scale bars, 100 µm.

(J) Bar graph showing cells that respond in both the free-access feeding and PR3 tasks. 



and functionally distinct neurons that utilize various neurotrans- 
mitters and neuropeptides, the precise mechanisms by which 
these cell populations orchestrate behavior have remained a 
mystery. The results described here demonstrate that Vgat- 
expressing neurons in the LH function to promote both 
appetitive and consummatory behaviors. Importantly, we show 
that these Vgat-targeted cells in the LH do not colocalize with
either of the neuropeptides MCH or Orx, in agreement with
previous findings that certain LH GABAergic subpopulations 
(GAD65-expressing neurons) are neurochemically and electro- 
physiologically distinct from MCH and Orx cells (Karnani et al., 
2013). However, it has also been reported that some MCH 
neurons do express GABAergic markers, such as GAD67, and 
release GABA from distal synaptic terminals (Jego et al., 
2013). Thus, transgene penetrance in Vgat-IRES-Cre mice may 
be reduced compared to endogenous LH Vgat expression 
and/or MCH neurons may express extremely low levels of 
Vgat, which can be dynamically regulated as observed in other 
GABAergic neuronal populations (Jarvie and Hentges, 2012; 
Lamas et al., 2001; Sperk et al., 2003). In addition, the present 
results do not rule out the possibility that these unique Vgat-targeted
GABAergic neurons contain other neuropeptides,
such as Neurotensin or Galanin (Allen and Cechetto, 1995; 
Goforth et al., 2014; Laque et al., 2013). Thus, parceling out pre- 
cise neuronal subtypes from these multidimensional populations 
in the LH to then elucidate their function still remains a major 
challenge. 



Divergent circuit connectivity between LH neurons and up- 
stream and downstream circuit nodes could also account for 
the diverse appetitive and consummatory response profiles of 
the LH GABAergic neuronal population. Previous retrograde 
and anterograde tracing studies demonstrate strong connectiv- 
ity between the LH and other feeding- and reward-associated 
brain regions that are also important for motivated behaviors, 
including other hypothalamic nuclei, midbrain, hindbrain, and 
striatal structures (Adamantidis et al., 2011; Betley et al., 2013; 
Gutierrez et al., 2011; Hahn and Swanson, 2012; Kempadoo 
et al., 2013; Leinninger et al., 2011), implying that the
selective-coding properties within separate LH GABAergic
subpopulations might be input and/or projection dependent.

While the LH controls both appetitive and consummatory 
behaviors, these processes are encoded by distinct cellular 
LH GABAergic subpopulations. Our data suggest that separate 
subsets of appetitive-coding and consumption-coding ensem- 
bles exist within the LH GABAergic network. Thus, the LH 
GABAergic network can be viewed as a mosaic of functionally 
and computationally distinct cell types, requiring further 
definition. Nevertheless, these important computational differ- 
ences among individual LH GABAergic neurons would have 
gone unnoticed if only bulk neuromodulatory approaches 
were employed, further underscoring the necessity of identifying 
the natural activity dynamics within a network to better under- 
stand the precise neural underpinnings of complex behavioral 
states.

EXPERIMENTAL PROCEDURES
Experimental Subjects 

All procedures were conducted in accordance with the Guide for the Care and 
Use of Laboratory Animals, as adopted by the NIH, and with approval of the 
Institutional Animal Care and Use Committee at the University of North 
Carolina at Chapel Hill (UNC). Adult (25-30 g) male Vgat-IRES-Cre (Vong
et al., 2011) or wild-type littermate mice were used. 

Behavioral Assays 

All food-deprived mice were restricted to 85% to 90% of their initial body weight 
by administering one daily feeding of ~2.5 to 3.0 g of standard grain-based 
chow immediately following each behavioral experiment, if performed. Animals 
were run on free-access feeding, real-time place preference, optical
self-stimulation, and operant feeding (FR1 and PR3) assays. For optogenetic
manipulations, light from solid-state lasers (473 or 532 nm) was delivered via
custom-made patch cables to chronic fibers implanted on the animals' heads, as described
previously (Jennings et al., 2013b). For each chemogenetic behavioral
manipulation, mice received i.p. injections of either vehicle or CNO (1 mg/kg) 45 min
prior to testing with at least 3 days in between each session (counterbalanced).
For further details, refer to the Extended Experimental Procedures. 

Freely Moving Imaging 

A miniature microscope with an integrated LED (475 nm) was used to image 
GCaMP6m fluorescence in LH GABAergic neurons through an implanted 
microendoscope. Using nVista HD Acquisition Software (v. 2; Inscopix), 
images were acquired at 15 frames per second with the LED transmitting 
0.1 to 0.2 mW of light on average. Ca²⁺ imaging was synchronized with
time-stamped behavioral data at the start of each session. All images were 
processed using the Mosaic software (v. 1.0.0b; Inscopix) and then analyzed 
with custom MATLAB scripts. For further details, refer to the Extended Exper- 
imental Procedures. 

Statistical Analysis 

Mean values are accompanied by SEM values. Comparisons were tested using 
paired or unpaired t tests. Two-way ANOVA tests followed by Bonferroni post 
hoc comparisons were applied for comparisons with more than two groups. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, seven 
figures, and three movies and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2014.12.026.

SUMMARY

The lateral hypothalamic (LH) projection to the ventral 
tegmental area (VTA) has been linked to reward 
processing, but the computations within the LH- 
VTA loop that give rise to specific aspects of 
behavior have been difficult to isolate. We show 
that LH-VTA neurons encode the learned action of 
seeking a reward, independent of reward availability. 
In contrast, LH neurons downstream of VTA encode 
reward-predictive cues and unexpected reward omis- 
sion. We show that inhibiting the LH-VTA pathway 
reduces “compulsive” sucrose seeking but not food 
consumption in hungry mice. We reveal that the LH 
sends excitatory and inhibitory input onto VTA dopa- 
mine (DA) and GABA neurons, and that the GABAergic 
projection drives feeding-related behavior. Our study 
overlays information about the type, function, and 
connectivity of LH neurons and identifies a neural cir- 
cuit that selectively controls compulsive sugar con- 
sumption, without preventing feeding necessary for 
survival, providing a potential target for therapeutic in- 
terventions for compulsive-overeating disorder. 

INTRODUCTION 

Tremendous heterogeneity exists across lateral hypothalamic
(LH) neurons in terms of function and connectivity, and this 
can be observed by the variety of behaviors related to reward, 
motivation, and feeding linked with this region. However, little 
is known about how the LH computes specific aspects of 
reward processing and how this information is relayed to down- 
stream targets. Electrical stimulation of the LH produces intra- 
cranial self-stimulation (ICSS) (Olds and Milner, 1954), as well 
as grooming, sexual, and gnawing behaviors (Singh et al., 
1996). LH neurons encode sensory stimuli (Norgren, 1970; Ya- 
mamoto et al., 1989), including reward-associated cues (Naka- 
mura et al., 1987). LH neurons also fire during both feeding 
(Burton et al., 1976; Schwartzbaum, 1988) and drinking (Tabu- 
chi et al., 2002). However, making sense of the remarkable 
functional heterogeneity observed in the LH has been a major 
challenge in the field.

Although the LH is interconnected with many subcortical
regions, we have a poor understanding of how the functional 
and cellular heterogeneity of the LH is transposed upon these 
anatomical connections. One LH projection target of interest is 
the ventral tegmental area (VTA), a critical component in reward 
processing (Wise, 2004). The LH-VTA projection was explored in 
early studies that used electrophysiological recordings com- 
bined with antidromic stimulation (Bielajew and Shizgal, 1986; 
Gratton and Wise, 1988). It has since been confirmed, using a 
rabies-virus-mediated tracing approach, that there is monosyn- 
aptic input from LH neurons onto dopamine (DA) neurons in the 
VTA (Watabe-Uchida et al., 2012). The VTA also sends reciprocal
projections back to the LH, both directly and indirectly via other
regions such as the nucleus accumbens, amygdala, hippocampus,
and ventral pallidum (Barone et al., 1981; Beckstead et al.,
1979; Simon et al., 1979). 

Although both electrical (Bielajew and Shizgal, 1986) and 
optical (Kempadoo et al., 2013) stimulation have established 
a causal role for the LH projection to the VTA in ICSS, several 
questions remain to be answered. First, what is the neural 
response of LH-VTA neurons to different aspects of reward- 
related behaviors? Second, what is the role of the LH-VTA 
projection in reward seeking under different reinforcement 
contingencies? Third, what is the overall composition of fast 
transmission mediated by LH inputs to the VTA, and which 
VTA cells receive excitatory/inhibitory input? Finally, what 
do the excitatory and inhibitory components of the LH-VTA 
pathway each contribute toward orchestrating the pursuit of 
appetitive reward? 

To address these questions, we recorded from LH neurons in 
freely moving mice and used optogenetic-mediated photoiden- 
tification to overlay information about the naturally occurring 
neural computations during reward processing upon information 
about the connectivity of LH neurons. In addition, we used 
ex vivo patch-clamp experiments to explore the composition 
of GABAergic and glutamatergic LH inputs onto both DA and 
GABA neurons within the VTA. Building on our results from the 
recording experiments, we utilized behavioral tasks to establish
causal relationships between aspects of both reward seeking 
and feeding and the activation of distinct subsets of LH-VTA pro- 
jections. Together, these data help us establish a model for how 
the components within the LH-VTA loop work together to pro- 
cess reward and how manipulating individual components can 
have profound effects on behavior.

RESULTS

Photoidentification of Distinct Components in the 
LH-VTA Circuit 

In order to identify LH neurons that provide monosynaptic input 
to the VTA in vivo and observe their activity during freely moving 
behaviors, we used a dual-virus strategy to selectively express
channelrhodopsin-2 (ChR2) in LH neurons providing monosynaptic
input to the VTA (Figures 1A and S1). We injected an
adeno-associated viral vector (AAV5) carrying ChR2-eYFP in
a Cre-recombinase-dependent double-inverted open reading
frame (DIO) construct into the LH to infect local somata and
injected a retrogradely traveling herpes simplex virus (HSV)
carrying Cre-recombinase into the VTA. Subsequent recombination
permitted opsin and fluorophore expression selectively in LH
neurons providing monosynaptic input to the VTA. To confirm
our approach, we performed ex vivo whole-cell patch-clamp
recordings in horizontal brain slices containing the LH and
recorded from neurons expressing ChR2-eYFP, as well as
neighboring LH neurons that were ChR2-eYFP negative (Figure
1B). Light-evoked spike latencies, measured from light-pulse
onset to the peak of the action potential, ranged from 3-8 ms
(Figure 1C). We also found that none of the non-expressing
(ChR2-negative) cells recorded showed excitatory responses
to photostimulation (n = 14; Figure 1C), despite their proximity
to ChR2-expressing cells.

In order to perform optogenetically mediated photoidentification
in vivo, an optrode was implanted into the LH to record
neuronal activity during a sucrose-seeking task. In the same
recording session, we provided several patterns of photostimulation
to identify ChR2-expressing LH-VTA neurons (Figures 1D
and S1). We examined the distribution of excitatory photoresponse
latencies across all LH neurons displaying a time-locked
change in firing rate in response to illumination and observed a
bimodal distribution (Figure 1E). We observed a population of
neurons during in vivo recordings with latencies in a range of
3-8 ms. This was identical to the latency range found in
ChR2-expressing LH-VTA neurons when we recorded ex vivo. We
termed these units “Type 1” units (Figures 1C, 1E, and 1F). In
addition, there was a distinct population of cells with ~100 ms
photoresponse latencies (Figures 1E and 1G), and we termed
these “Type 2” units. We also observed neurons that were inhibited
in response to photostimulation of LH-VTA neurons (Figure
S2), and we termed these “Type 3” units. We compared the
action potential duration (as measured from peak to trough) and
mean firing rates of Type 1 and Type 2 units as well as those that
did not show a photoresponse (Figure 1H). The distribution of
action potential durations of Type 1 (Figure 1I) and Type 2 (Figure
1J) units shows that the majority of Type 1 units have an action
potential duration less than 500 µs (84%; n = 16/19, binomial
distribution, p = 0.002).
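
The binomial statistic quoted above can be illustrated with a short calculation. The text does not state the null hypothesis or the sidedness of the test, so the sketch below assumes a one-sided test against a chance proportion of 0.5; the function name is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """Probability of observing k or more successes in n Bernoulli trials.

    Assumption: the null proportion p = 0.5 and the one-sided tail are
    illustrative choices, not details given in the text.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 16 of 19 Type 1 units had an action potential duration below 500 µs.
p_type1 = binomial_upper_tail(16, 19)
```

Under these assumptions the tail probability for 16 of 19 units is 1160/524288, or about 0.0022, which is consistent with the reported p = 0.002.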

Although Type 1 units fit standard criteria to be classified as
ChR2 expressing (Cohen et al., 2012; Zhang et al., 2013), it
was unclear whether the longer latency photoresponse of Type
2 units was indicative of ChR2-expressing neurons that responded
more slowly to photostimulation, or whether this effect
was due to network activity. Given that the ChR2-expressing
(Type 1) LH neurons project directly to the VTA, one possibility
was that Type 2 neurons were receiving feedback from the
VTA (Figure 1K). Another possibility was that Type 2 neurons
were activated by axon collaterals from Type 1 neurons (Figure
1L). To differentiate between these two possible circuit
models, we inhibited the VTA in conjunction with photoidentification
in the LH.

Long Latency Photoresponses in LH Neurons Are 
Mediated by Feedback from the VTA 

Based on our circuit models, we would expect distal inhibition to
have no effect on the photoresponses of ChR2-expressing LH
neurons. However, if photoresponsive, but non-expressing, LH
neurons relied on feedback from the VTA to elicit a time-locked
response to illumination (Figure 1K), we would expect an attenuation
of photoresponses in these neurons upon VTA inhibition.
We expressed ChR2 in LH-VTA cells as above, but this time
also expressed enhanced halorhodopsin 3.0 (NpHR) in the VTA
and implanted an optic fiber in the VTA in addition to the optrode
in LH (Figure 2A). We delivered the same blue-light illumination
patterns in the LH for all three epochs but also photoinhibited
the VTA with yellow light in the second epoch (Figure 2A).

The photoresponses of Type 1 units to blue-light illumination in
the LH were unaffected by photoinhibition of the VTA, which is
consistent with ChR2 expression in Type 1 LH-VTA neurons (Figure
2B). In contrast, the majority of Type 2 units (87%; n = 13/15,
binomial distribution, p = 0.004) showed a significant attenuation
of photoresponses to blue-light pulses delivered in the LH upon
photoinhibition of VTA neurons. The responses of Type 1 and 
Type 2 units during VTA photoinhibition were significantly 
different (chi-square = 7.64, p = 0.0057; Figures 2B and 2C). 
These differences can also be seen in the max Z scores during 
individual epochs (Figure 2D) and with the yellow-ON epoch 
normalized to the yellow-OFF epoch (Figure 2E). These data 
suggest that Type 2 LH neurons receive input (either directly or 
indirectly) from the VTA (Figure 1K) rather than via local axon
collaterals (Figure 1L). 

Distinct Encoding Properties of LH Neurons Either
Upstream or Downstream of the VTA

Having identified these two distinct types of LH neurons in the 
LH-VTA loop, we wanted to examine naturally occurring neural 
activity during a sucrose self-administration task (Figure 3A). 
Mice were trained to perform nosepoke responses for a cue pre- 
dicting sucrose delivery at an adjacent port (as in Tye et al., 
2008). To allow us to differentiate neural responses to the nose- 
poke and the cue, the cue and sucrose were delivered on a par- 
tial reinforcement schedule, wherein 50% of nosepokes were 
paired with a cue and sucrose delivery. 

Type 1 units showed phasic responses to sucrose port entry, 
as seen in a representative Type 1 unit (Figure 3B), as well as the 
population data for all Type 1 units (Figure 3C). The phasic re- 
sponses of Type 2 units, however, mainly reflected responses 
to the reward-predictive cue (Figures 3D and 3E). The normalized 
firing patterns of all recorded neurons (n = 198, divided into Type
1, 2, 3, and non-responsive units) are displayed for each task
component: nosepokes paired with the cue, nosepokes in the 
absence of the cue, and sucrose port entry (Figure 3F). All 
Type 1 units that showed task-relevant phasic changes in activity

Figure 1. Phototagging LH-VTA Projections Reveals
Two Populations of Neurons with Different Response 
Latencies to Photostimulation 

(A) Wild-type mice (n = 12) were injected with AAV5-DIO-ChR2-eYFP
into the lateral hypothalamus (LH) and
HSV-EF1α-IRES-Cre-mCherry into the ventral tegmental area
(VTA). 

(B) Horizontal brain slices containing the LH were prepared 
for whole-cell patch-clamp recordings in ChR2-expressing 
and non-expressing LH neurons. 

(C) Individual traces recorded in current-clamp mode 
showing the response of ChR2-expressing (green, n = 10) 
and non-expressing (gray, n = 14) cells to a 5 ms pulse of 
473 nm light are shown. The box and whisker plot shows the 
average response latency for each ChR2-expressing cell 
ex vivo. 

(D) Photoresponse latencies in vivo were calculated by 
measuring the time from stimulation to 4 SD above the 
baseline firing rate. 

(E) A bimodal distribution of excitatory photoresponse la- 
tencies was identified in recorded units (n = 198) and divided 
into Type 1 (green; n = 19) and Type 2 units (blue; n = 34). 

(F) Type 1 units responded to photostimulation with fast 
excitation (3-8 ms latency). Inset shows the overlaid average 
traces for spontaneous spiking (black) and light-evoked 
spiking (blue) from a representative unit. 

(G) Type 2 units responded to photostimulation with delayed 
excitation (80-120 ms latency). 

(H) Scatterplot depicting the peak-trough duration of the 
waveform plotted against the average firing rate for each unit. 
(I and J) Normalized histogram showing the distribution of 
peak-trough durations for Type 1 units (I) and Type 2 units (J). 
(K and L) Diagrams illustrating two possible circuit models. (K) 
Type 1 units project directly from the LH to the VTA, whereas 
Type 2 units represent a population in the LH that is receiving 
feedback from the VTA; or (L) Type 2 units represent a pop- 
ulation in the LH that is receiving input from collaterals of Type 
1 units. Dotted lines indicate the presence of either a mono- 
synaptic or polysynaptic connection. 

Scale bar: y axis, 0.2 mV; x axis, 500 µs. See also Figure S1.
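
The latency criterion in (D), the time from stimulation until firing exceeds 4 SD above the baseline rate, can be sketched as below. This is an illustrative reading only: the bin width, the use of the sample standard deviation, the reporting of the bin's left edge, and the function name are all assumptions.

```python
from statistics import mean, stdev

def photoresponse_latency(baseline_rates, post_rates, bin_ms=1.0, n_sd=4.0):
    """Time (ms) of the first post-stimulus bin above baseline mean + n_sd * SD.

    Hypothetical helper: both inputs are binned firing rates, and the
    returned latency is the left edge of the first bin over threshold.
    Returns None when no bin crosses the threshold.
    """
    threshold = mean(baseline_rates) + n_sd * stdev(baseline_rates)
    for i, rate in enumerate(post_rates):
        if rate > threshold:
            return i * bin_ms
    return None

# Toy data: baseline near 5 Hz, light-evoked burst starting in the fifth bin.
baseline = [4.0, 6.0, 5.0, 5.0, 4.0, 6.0]
post = [5.0, 6.0, 5.0, 7.0, 30.0, 28.0]
latency = photoresponse_latency(baseline, post)
```

With 1 ms bins, the toy unit crosses threshold in the bin starting 4 ms after stimulation, which would place it in the 3-8 ms range described for Type 1 units.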
Figure 2. Inhibition of the VTA Selectively Attenuates
the Photoresponse of Type 2, but not Type 1,
Units

(A) Mice expressing ChR2 in LH-VTA projections received 
an additional injection of AAV5-CaMKIIα-eNpHR3.0-eYFP
into the VTA to allow for transient inhibition of VTA
neurons by yellow light. Three epochs of phototagging 
were conducted (LH photoactivation: ON-ON-ON, VTA 
photoinhibition: OFF-ON-OFF). 

(B) Type 1 (n = 6/121 units, n = 6 animals) photoresponse 
properties were unaffected (0%; n = 0/6 attenuated or 
abolished) by VTA inhibition. Inset circles represent the 
number of units photoresponsive during each epoch. 
Inset shows the overlaid average traces for spontaneous 
spiking (black) and light-evoked spiking (blue) from a 
representative unit. 

(C) Type 2 (n = 15/121 units, n = 6 animals) photoresponse
properties were abolished (67%; n = 10/15) or attenuated 
(87%; n = 13/15) during NpHR-mediated VTA inhibition. 

(D) No significant difference in max Z score was detected 
between epochs with and without inhibition of the VTA for 
Type 1 units (two-tailed, paired Student’s t test, p = 0.71). 
The max Z score was significantly lower in the ON (LH 
blue light illumination + VTA photoinhibition) epoch rela- 
tive to the first OFF epoch (LH blue light illumination only) 
for Type 2 units (two-tailed, paired Student’s t test, **p = 
0.0015). 

(E) There was a significant difference in max Z score 
(normalized to the OFF epoch) during photoinhibition of 
the VTA between Type 1 units compared to Type 2 units 
(two-tailed, unpaired Student’s t test, *p = 0.014). 

Error bars indicate ± SEM. Scale bar: y axis, 0.2 mV;
x axis, 500 µs. See also Figure S3.

Figure 3. Type 1 Units Predominantly Respond to the Port Entry, whereas Type 2 Units Respond to Both the Conditioned Stimulus and the
Port Entry

(A) Mice with optrodes implanted in the LH and expressing ChR2 in LH-VTA projections were trained on a task where 50% of nosepokes (NP) were followed by a 
cue (conditioned stimulus; CS) that predicts the delivery of sucrose (unconditioned stimulus; US) at the delivery port. In vivo electrophysiological recordings were 
performed during the behavioral task followed by phototagging in the same recording session to identify units by projection target. 

(74%; n = 14/19) were either phasically excited or inhibited by sucrose
port entry, with a small number also showing phasic inhibition
to the reward-predictive cue (Figures 3B, 3C, and 3G). In
contrast, Type 2 units were more heterogeneous, with
task-responsive neurons encoding the cue selectively (35%), the sucrose
port-entry selectively (26%), or both the cue and port entry
(12%; Figures 3D, 3E, and 3H). To illustrate the strength of responses
of Type 1 and Type 2 units to task-related events, we
plotted each cell on a three-dimensional plot according to Z
score (Figure 3I). To show the distribution of phasic changes in
firing to multiple task-related events on a qualitative level, we 
plotted the number of cells of each photoresponse type that 
fell into a given category (Figure 3J). 

Different Components of the LH-VTA Circuit Represent 
Distinct Aspects of Reward-Related Behavior 

Given the well-defined role of the VTA in reward-prediction 
error (e.g., the phasic reduction of DA neuron firing in response 
to the unexpected omission of a reward and the phasic 
excitation in response to unexpected reward delivery) (Schultz 
et al., 1997), we investigated whether LH neurons would 
encode the unexpected omission of a sucrose reward. To do 
this, we recorded the neural activity of photoresponsive neu- 
rons during the same cue-reward task in well-trained animals 
but randomly omitted 30% of sucrose deliveries following the 
cue (Figure 4A). 

The majority of Type 1 units (88%; n = 15/17, binomial distribu- 
tion, p = 0.001) were insensitive to reward omission (Figures 4B 
and 4D), whereas a large subset of Type 2 units (67%; n = 12/18) 
showed a significantly different response to reward-presented 
and reward-omitted trials (Figures 4C and 4D). We concluded 
that LH-VTA (Type 1) neurons encoded the action of entering 
the port, as these port-entry responses were persistent even 
upon reward omission (Figure 4D), in contrast to Type 2 units 
(chi-square = 10.9804, p = 0.0009). 
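
The chi-square comparison above can be checked from the reported counts (2 of 17 Type 1 units and 12 of 18 Type 2 units sensitive to reward omission). The sketch below assumes a Pearson chi-square test on the 2 x 2 table without continuity correction; the text does not name the variant, and the function name is hypothetical.

```python
from math import erfc, sqrt

def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 degree of freedom
    return chi2, p

# Rows: Type 1, Type 2. Columns: omission-sensitive, omission-insensitive.
chi2, p = pearson_chi_square_2x2(2, 15, 12, 6)
```

With these counts the uncorrected statistic is approximately 10.98 with p of roughly 0.0009, which agrees with the values reported in the text.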

To determine whether Type 1 responses to port entry were 
truly encoding the conditioned response (CR), as opposed to 
general reward-seeking or exploratory behavior, we recorded 
in untrained mice that had not yet acquired the task. In task-naive 
mice, we delivered sucrose to the port in the absence of a pre- 
dictive cue (unpredicted reward delivery) and found that Type 
1 units did not show phasic responses to port entry (Figures
4E, 4F, and 4I), consistent with the model that Type 1 neurons
encode the CR (Figure 4J). 

Next, to determine whether Type 2 unit activity is consistent 
with a reward-prediction error-like response profile, we also re- 
corded these neurons in well-trained animals during unpredicted 
reward delivery (Figure 4G). We found that a subset of Type 
2 units responded to unpredicted sucrose deliveries (50%; Fig- 
ures 4G-4I). Taken together, subsets of Type 2 units are sensitive 
to unexpected reward omission (Figures 4C and 4D) and unpre- 
dicted reward delivery (Figures 4G-4I), consistent with a reward- 
prediction error-like response profile. 

Photostimulation of the LH-VTA Pathway Promotes 
Sucrose Seeking in the Face of a Negative Consequence 

As we have shown above, Type 1 units represent a neural correlate
of CR. Importantly, the increase in firing rate begins prior to
CR, ramping up until the CR has been completed (Figures 3B, 
3C, and 4B). To determine whether activation of the LH-VTA 
pathway could promote CR, we tested the ability of LH-VTA
activation to drive CR in the face of a negative consequence.
In wild-type mice, we expressed ChR2-eYFP or eYFP
alone in LH cell bodies and implanted an optic fiber over the 
VTA (Figures 5A and S4). Conversely, to test the role of the LH- 
VTA pathway in mediating CR or feeding-related behaviors, we 
bilaterally expressed NpHR-eYFP or eYFP alone in LH cells and 
implanted an optic fiber above the VTA (Figures 5A and S4). 

We designed a Pavlovian conditioning task in which food- 
deprived mice had to cross a shock grid to retrieve a sucrose 
reward (Figure 5B). In the first “baseline” epoch (with the shock 
grid off), we verified that each mouse had acquired the Pavlovian 
conditioned approach task. In the second (“Shock”) epoch, the 
shock grid delivered mild foot shocks every second. Finally, in 
the third epoch (“Shock+Light”), we continued to deliver foot 
shocks but also illuminated LH terminals in the VTA with blue 
light (10 Hz) in mice expressing ChR2 and matched eYFP con- 
trols and yellow light (constant) for mice expressing NpHR and 
their eYFP controls (Figure 5B). 

We observed a significantly higher number of port entries per 
cue during the Shock+Light epoch and a significantly higher dif- 
ference score (Shock+Light epoch - Shock-only epoch) in ChR2 
mice relative to eYFP mice (Figure 5C and Movie S1). In contrast,
photoinhibition of the LH-VTA pathway resulted in a significant 



(B) Perievent raster histograms for a representative Type 1 unit that responded to port entry, but not to the reward-predictive cue. Inset shows overlaid average 
traces for spontaneous spiking (black) and light-evoked spiking (blue) from a representative unit. 

(C) Population Z score plots showing the average responses of all Type 1 units (n = 19/198 units, n = 12 animals). 

(D) Perievent raster histograms for a representative Type 2 unit that responded to the reward-predictive cue, but not to port entry. 

(E) Population Z score plots show the average responses of all Type 2 units (n = 34/198 units, n = 12 animals).

(F) Heatmap representation of the individual Z scores of all units. 

(G) Of all Type 1 units, 63% responded exclusively to the port entry (n = 12/19), whereas 11% responded to both the port entry and the reward-predictive cue (n =
2/19). Within the Type 1 units that responded to the port entry, 64% (n = 9/14) were excited (red) upon port entry, whereas 36% (n = 5/14) were inhibited (blue), and
within the units that responded to the reward-predictive cue, 100% (n = 2/2) were inhibited by the cue.

(H) Of all Type 2 units, 35% (n = 12/34) responded exclusively to the reward-predictive cue, 26% (n = 9/34) responded exclusively to the port entry, and 12% (n = 4/34) 
responded to both. Within the Type 2 units that responded to the cue, 100% (n = 16/16) were excited by the cue, whereas none were inhibited, and within the 
units that responded to port entry, 77% (n = 10/13) were inhibited upon port entry, whereas 23% (n = 3/13) were excited. 

(I) Graphical representation of Z scores during the experimental windows for cue, no cue, and port entry for Type 1, Type 2, and “no photoresponse” units. 

(J) Diagram of recorded units demonstrating whether they responded to the cue or port entry (PE) and whether that response was with excitation (+) or 
inhibition (−). 

Error bars indicate ± SEM. Scale bar: y axis, 0.2 mV; x axis, 500 μs. See also Figure S2. 
Figure 4. LH-VTA Neurons Encode the CR of Sucrose Seeking 

(A) The original partial reinforcement sucrose self-administration task was 
modified so that in 30% of trials during which the reward-predictive cue was 
present, the expected sucrose delivery was omitted (15% of all trials). 

(B) Perievent raster histograms of a Type 1 unit that showed no difference in 
response to port entry with reward omission. Inset shows overlaid average 
traces for spontaneous spiking (black) and light-evoked spiking (blue) from a 
representative unit. 

(C) Perievent raster histograms of a Type 2 unit that showed a significantly 
different response to port entry upon omission of the expected reward. 

(D) Of all Type 1 units recorded (n = 17/122 units, n = 6 animals), only 12% (n = 
2/17) showed a significant difference in their responses when the expected 
reward was omitted. In contrast, of all Type 2 units recorded (n = 18/122 units, 
n = 6 animals), 67% (n = 12/18) showed a significant difference in their 
responses when the expected reward was omitted (chi-square = 10.9804, 
***p = 0.0009). 

(E) Unexpected sucrose delivery occurred in the absence of predictive cues. 
Perievent raster histogram of a Type 1 unit that did not respond to port entry 
following unpredicted reward delivery is shown. 

(F) Population Z score plot showing the average responses of all Type 1 units to 
the port entry following unpredicted reward delivery. 

(G) Perievent raster histogram of a Type 2 unit that showed an increase in firing 
rate to port entry following unpredicted reward delivery. 

(H) Population Z score plot of Type 2 unit responses to port entry following 
unpredicted reward delivery, separated into those that showed a significant 
response and those that showed no significant response. 

(I) Of all Type 1 units recorded (n = 8/105 units, n = 6 animals), 0% (n = 0/8) 
showed a significant response to the port entry following unpredicted reward 
delivery. In contrast, of all Type 2 units recorded (n = 16/105 units, n = 6 animals), 
50% (n = 8/16) showed a significant response to the port entry following 
unpredicted reward delivery (chi-square = 6, *p = 0.0143). 

(J) Schematic of the LH-VTA loop and the components of reward processing 
encoded by Type 1 and 2 cells. CR = conditioned response; CS = conditioned 
stimulus; US = unconditioned stimulus. 

Scale bar: y axis, 0.2 mV; x axis, 500 μs. 



reduction in port entries per cue and difference scores in the 
NpHR mice relative to eYFP mice (Figure 5D and Movie S2). 
Within-session extinction experiments during which cue presen- 
tations were not followed by sucrose deliveries showed similar 
trends in effect (Figure S4). 
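
The difference score used above (Shock+Light epoch minus Shock-only epoch, in port entries per cue) can be illustrated with a minimal sketch. The figure of 20 cues per epoch is taken from the Figure 5B legend; the per-mouse counts, the function names, and the choice to report only the pooled-variance t statistic (without a p value) are assumptions made for illustration and are not the data or the analysis code of the study.

```python
import math
from statistics import mean, variance

CUES_PER_EPOCH = 20  # cue presentations per epoch, per the Figure 5B legend


def entries_per_cue(port_entries, n_cues=CUES_PER_EPOCH):
    """Port entries normalized to the number of cue presentations."""
    return port_entries / n_cues


def difference_score(shock_light_entries, shock_entries, n_cues=CUES_PER_EPOCH):
    """Shock+Light epoch minus Shock-only epoch, in port entries per cue."""
    return (entries_per_cue(shock_light_entries, n_cues)
            - entries_per_cue(shock_entries, n_cues))


def student_t(x, y):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, df


# HYPOTHETICAL port-entry counts per mouse: (Shock+Light, Shock-only).
chr2_counts = [(14, 6), (12, 5), (16, 8), (11, 7)]
eyfp_counts = [(6, 6), (5, 7), (8, 7), (7, 6)]

chr2_scores = [difference_score(a, b) for a, b in chr2_counts]
eyfp_scores = [difference_score(a, b) for a, b in eyfp_counts]

t, df = student_t(chr2_scores, eyfp_scores)
print(f"t({df}) = {t:.2f}")
```

A positive difference score indicates more port entries per cue with light than with shock alone; converting the t statistic to a two-tailed p value would additionally require the t distribution with the stated degrees of freedom.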

Importantly, we wanted to determine whether the changes in 
sucrose seeking we had obtained were caused by changes in 
feeding-related behavior or sensitivity to pain. We observed 
that photoactivation of the LH-VTA projection significantly 
increased the time spent feeding in well-fed mice in the ChR2 
group (Figure 5E). However, photoinhibition of the LH-VTA 
pathway did not significantly reduce feeding (Figure 5F), even 
though these animals were food deprived to enhance our ability 
to detect a reduction relative to the baseline epoch (compare to 
sated animals in Figure 5E). In neither the ChR2 (Figure 5G) nor 
NpHR group (Figure 5H) did we observe a difference in latency 
to tail withdrawal from hot water (Ben-Bassat et al., 1959; Grotto 
and Sulman, 1967), indicating that manipulating the LH-VTA projection 
was not altering analgesia. 

LH Provides Both Glutamatergic and GABAergic Input 
onto VTA DA and GABA Neurons 

To study the composition of the fast transmission components 
of LH inputs to the VTA that were eliciting these effects, we 
performed whole-cell patch-clamp recordings from VTA neu- 
rons in an acute slice preparation while optically activating LH 
Figure 5. Excitation of LH-VTA Projections Promotes, whereas Inhibition 
Attenuates, Compulsive Sucrose Seeking 

(A) Mice received injections of AAV5-CaMKIIα-ChR2-eYFP (n = 8), 
AAV5-CaMKIIα-eNpHR3.0-eYFP (n = 14), or AAV5-CaMKIIα-eYFP (n = 6 controls for 
ChR2, n = 8 controls for NpHR) into the LH, and an optic fiber was implanted 
above the VTA. 

(B) Mice were trained on a Pavlovian conditioned approach task wherein a cue 
predicted sucrose delivery to a port located across a shock grid. On test day, 
mice were presented with 20 cues during a baseline period without shock, 20 
cues when the shock grid was on, and 20 cues during which 10 Hz blue or 
constant yellow light was delivered while the shock floor remained on. 

(C) Mice in the ChR2 group showed a significant increase in the number of port 
entries per cue during the “Shock+Light” epoch relative to eYFP controls (n = 8 
ChR2, n = 6 eYFP; two-way ANOVA revealed a group × epoch interaction, 
F2,24 = 20.47, p < 0.0001; Bonferroni post-hoc analysis, *p < 0.05). The difference 
between the number of port entries per cue during the “Shock+Light” 
epoch and “Shock” epoch was also significantly different between the ChR2 
and eYFP control groups (two-tailed, unpaired Student’s t test, **p = 0.0090). 

(D) Mice in the NpHR group showed a significant decrease in the number of 
port entries per cue during the Shock+Light epoch relative to eYFP controls 
(n = 13 NpHR, n = 8 eYFP; two-way ANOVA revealed a group × epoch interaction, 
F2,38 = 116.63, p < 0.0001; Bonferroni post-hoc analysis, *p < 0.05). The 
difference score was also significantly different between the NpHR-expressing 
and eYFP control mice (two-tailed, unpaired Student’s t test, **p = 0.0062). 

(E) Mice were placed into an open chamber with two cups, one containing food 
and the other without, and behavior in three experimental epochs was re- 
corded (light OFF-ON-OFF). ChR2-expressing mice showed a significant in- 
crease in feeding (measured by time spent consuming food) compared with 
eYFP controls during the epoch paired with blue-light stimulation (n = 8 ChR2, 
n = 6 eYFP; two-way ANOVA revealed a group × epoch interaction, F2,24 = 
4.23, p = 0.0268; Bonferroni post-hoc analysis, **p < 0.01). 

(F) NpHR-expressing mice showed no significant differences from eYFP control 
mice in time spent feeding in any of the epochs (n = 9 NpHR, n = 7 eYFP). 

(G and H) To examine the effect of light stimulation on analgesia, mice had their 
tails placed into a heated water bath, and the latency-to-tail withdrawal was 
measured during two counterbalanced epochs (light ON-OFF). (G) ChR2-ex- 
pressing mice showed no significant difference in tail-withdrawal latency 
(normalized to OFF epoch) during blue-light stimulation compared to eYFP 
controls (n = 8 ChR2, n = 6 eYFP), (H) nor did NpHR-expressing mice during 
yellow-light stimulation (n = 5 NpHR, n = 8 eYFP). 

Error bars indicate ± SEM. See also Figure S4. 



inputs expressing ChR2-eYFP (Figures 6A and S5). Given 
that there is well-established heterogeneity within the VTA, 
including ~65% DA neurons, ~30% GABA neurons, and ~5% 
glutamate neurons (Margolis et al., 2006; Nair-Roberts et al., 
2008; Yamaguchi et al., 2007), we filled cells with biocytin while 
recording to allow for identification of cell type using post-hoc 
immunohistochemistry for tyrosine hydroxylase (TH; Figure 6B), 
in addition to recording the hyperpolarization-activated cation 
current (Ih) and mapping cell location (Figures 6B and S5). 

First, we recorded in current-clamp during photostimulation of 
ChR2-expressing LH inputs and observed that 23 of 27 neurons 
showed a time-locked response to photostimulation of LH inputs 
(Figure 6C). The majority of DA neurons sampled in the VTA 
received a net excitatory input from the LH (56%), whereas 
another subset showed net inhibition (30%; Figure 6C). The 
spatial distribution of these DA neurons is mapped onto an atlas 
for horizontal slices containing the VTA (Figure 6D). 

To establish the monosynaptic contribution of LH inputs to 
VTA DA neurons, we used ChR2-assisted circuit mapping, 
where voltage-clamp recordings were performed in the pres- 
ence of tetrodotoxin (TTX) and 4-aminopyridine (4AP; Petreanu 
Figure 6. The LH Sends a Mixture of Excitatory 
and Inhibitory Projections to Both DA 
and GABA Neurons in the VTA 

(A) AAV5-CaMKIIα-ChR2-eYFP was injected into 
the LH, and at least 6 weeks later, 300 μm-thick 
horizontal brain slices were prepared containing 
the VTA. Whole-cell patch-clamp recordings were 
made in VTA neurons, and ChR2-expressing LH 
terminals were activated by illumination with 
473 nm light via an optic fiber resting on the brain 
slice. 

(B) Neurons were filled with biocytin during 
recording, and DA neurons were identified by 
immunohistochemistry for TH (n = 27). 

(C) The net effect of optical stimulation of LH ter- 
minals was assessed in current-clamp mode, 
which revealed that 55% of DA neurons (n = 15/27) 
showed a net excitatory response, whereas 30% 
(n = 8/27) responded with net inhibition, and 15% 
(n = 4/27) showed no response. An example of an 
excitatory postsynaptic potential (EPSP, red trace), 
an inhibitory postsynaptic potential (IPSP, blue 
trace), and a non-responsive cell (gray trace) are 
shown below each bar. 

(D) The distribution of all recorded TH+ neurons 
plotted on horizontal midbrain slices with colors 
indicating the response to LH terminal photo- 
stimulation. 

(E) VTA DA neurons received only AMPAR-medi- 
ated input (67%, n = 6/9), only GABAAR-mediated 
input (11%, n = 1/9), or both of these currents (22%, 
n = 2/9). 

(F) VTA GABA neurons were identified by the 
presence of mCherry (n = 24), achieved by injection 
of Cre-dependent AAV5-EF1α-DIO-mCherry into 
the VTA of VGAT::Cre mice. 

(G) Optical stimulation of LH terminals in current- 
clamp mode showed that GABA neurons respond 
with either net excitation (46%, n = 11/24) or net 
inhibition (54%, n = 13/24) to LH input. 

(H) The distribution of each recorded GABA neuron 
plotted on horizontal midbrain slices with colors 
indicating the response to LH terminal stimulation. 

(I) GABA neurons received a mixture of AMPAR- 
mediated and GABAAR-mediated input from the 
LH (AMPA only: 18%, n = 2/11; AMPA & GABAA: 
73%, n = 8/11; GABAA: 9%, n = 1/11). 

MT = medial terminal nucleus of the accessory 
optic tract. See also Figures S5 and S6. 
et al., 2007). Consistent with our observations from current-clamp 
recordings, we observed that the majority of recorded 
VTA DA neurons exclusively received excitatory monosynaptic 
input from the LH (67%), compared to VTA DA neurons that 
exclusively received inhibitory monosynaptic input (11%), or 
both (22%; Figures 6E and S6). 

We identified VTA GABA neurons by injecting a Cre-dependent 
fluorophore (AAV5-DIO-mCherry) into the VTA of 
VGAT::Cre mice and utilized mCherry expression to direct 
the recording of VTA GABA neurons (n = 24; Figure 6F). 
Forty-six percent of VTA GABA neurons responded with net 
excitation, whereas 54% responded with net inhibition, to 
photostimulation of ChR2-expressing LH inputs (Figure 6G). 
The spatial distribution of these cells is shown in Figure 6H. 
Upon examination of the monosynaptic input from the LH (as 
described above), we found that 18% of sampled GABA 
neurons received exclusively excitatory input and 9% received 
exclusively inhibitory input (Figure 6I). However, relative to VTA 
DA neurons, we found that more VTA GABA neurons received 
both excitatory AMPAR-mediated and inhibitory GABAAR-mediated 
monosynaptic input from the LH (73%; chi-square = 
5.0505, p = 0.0246; Figures 6I and S6). 

Distinct Roles of Glutamatergic and GABAergic 
Components of the LH-VTA Pathway in Behavior 

Given that our ex vivo recordings provided evidence supporting 
robust input from both GABAergic and glutamatergic LH projec- 
tions to the VTA, we next probed the role of each component 
independently. To do this, we used transgenic mouse lines 
expressing Cre-recombinase in neurons that expressed either 
vesicular glutamate transporter 2 (VGLUT2) or vesicular GABA 
transporter (VGAT). We injected AAV5-DIO-ChR2-eYFP or 
AAV5-DIO-eYFP into the LH of VGLUT2::Cre and VGAT::Cre 
mice and implanted an optic fiber over the VTA (Figure S7). 
These animals were then run on each of the behavioral assays 
shown in Figure 5. 

We did not observe any detectable differences in the number 
of port entries made per cue between mice expressing ChR2 or 
eYFP in the LHglut-VTA projection (Figure 7A) or in the LHGABA-VTA 
projection (Figure 7B). However, upon video analysis, we 
noticed aberrant gnawing behaviors in the LHGABA-VTA:ChR2 
group upon blue-light illumination (see Movies S3 and S4). In 
LHglut-VTA mice, although there was a trend toward a reduction 
in feeding upon photostimulation in the ChR2 group compared to 
the eYFP group, this was not statistically significant (Figure 7C). 
In contrast, we observed a robust increase in the time spent 
feeding in sated mice upon illumination in the LHGABA-VTA:ChR2 
group relative to controls (Figure 7D and Movie S3). In neither 
group of animals was there an effect of light stimulation in the 
tail-withdrawal assay (Figures 7E and 7F). 

During the feeding task, as we did during the sucrose-seeking 
task, we again noticed aberrant feeding-related motor sequences 
that were not directed at food. We filmed a representative 
mouse in the LHGABA-VTA:ChR2 group in an empty 
transparent chamber, and upon 20 Hz photostimulation, we 
observed unusual appetitive motor sequences such as licking 
and gnawing the floor or empty space (Movie S4). We quantified 
these “gnawing” behaviors during the feeding task in the wild-type 
LH-VTA (Figure 7G), LHglut-VTA (Figure 7H), and LHGABA-VTA 
(Figure 7I) groups and showed that LHGABA-VTA:ChR2 
mice gnawed more than wild-type or LHglut-VTA:ChR2 mice 
when photostimulated, as compared to their respective eYFP 
groups (Figure 7J). We considered whether the aberrant 
feeding-related behaviors might be separated from appropri- 
ately directed feeding at lower frequencies. However, when 
we tested the LHGABA-VTA:ChR2 group with 5 Hz and 10 Hz 
trains of blue light, we observed a proportional relationship 
between stimulation frequency and both feeding and gnawing 
(Figure 7K). 

DISCUSSION 

Functional Components of the LH-VTA Loop 

The LH projection to the VTA has been explored with electrical 
stimulation collision studies (Bielajew and Shizgal, 1986) and 
has long been hypothesized to play a role in reward processing 
(Hoebel and Teitelbaum, 1962; Margules and Olds, 1962), yet 
pinpointing this role has been a challenge. Here, we are providing 
a detailed dissection of how individual components of the LH- 
VTA loop process different aspects of a reward-related task. 

Through the use of optogenetic-mediated phototagging (Fig- 
ure 1), we have identified two separate populations of LH neu- 
rons: cells that send projections to the VTA (Type 1) and cells 
that receive feedback from the VTA (Type 2; Figure 2)— though 
these populations need not be mutually exclusive, as it is 
possible that LH neurons could both send and receive inputs 
to and from the VTA. Interestingly, we found that relatively few 
photoresponsive neurons fell outside the bimodal distribution 
encapsulating these two populations (Figures S2B and 1E). 
Given this, in combination with the long latency delay in Type 2 
photoresponses (~100 ms), we speculate that there may be 
one dominant pathway contributing to the activity of Type 2 neu- 
rons. Additionally, because DA binds G protein-coupled recep- 
tors, the kinetics are slower than most glutamatergic synapses 
(Girault and Greengard, 2004) and may explain this cluster of 
100 ms latency photoresponsive units. It is also possible that 
the VTA may provide indirect feedback through other distal re- 
gions, via excitatory intermediate regions such as the amygdala, 
or with disinhibition via the nucleus accumbens (NAc) or bed nu- 
cleus of the stria terminalis (BNST). 

Interestingly, whereas photostimulation of Type 1 units evokes 
excitatory responses in Type 2 units, Type 1 and 2 units show 
distinct behavioral encoding properties. For example, the 
numbers of Type 1 and Type 2 units that selectively encode 
the reward-predictive cue are significantly different (n = 0/19 
Type 1 versus n = 12/34 Type 2, chi-square = 8.67, p = 0.003). 
This paradoxical response pattern could be due to computa- 
tional processes at an intermediate circuit element, such as the 
VTA, that may be playing an active role during the behavioral 
task but inactive during phototagging. Additionally, the behav- 
ioral state of the animal could influence how these data are 
processed. 
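
The chi-square comparisons of unit proportions reported in the text and figure legends can be checked with a short calculation. The sketch below assumes a Pearson chi-square test on a 2 × 2 table without continuity correction (1 degree of freedom); this choice is an inference from the reported values, which it appears to reproduce, and is not stated explicitly in the text. The function name and dictionary labels are illustrative; the counts are those given in the text and legends.

```python
import math


def chi_square_2x2(hits_a, n_a, hits_b, n_b):
    """Pearson chi-square (no continuity correction) comparing two proportions.

    Returns (chi2, p) for a 2 x 2 table with 1 degree of freedom.
    """
    a, b = hits_a, n_a - hits_a  # group A: responders, non-responders
    c, d = hits_b, n_b - hits_b  # group B: responders, non-responders
    n = n_a + n_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with df = 1.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# (responders in group A, size of A, responders in group B, size of B)
comparisons = {
    "cue-selective units, Type 1 vs Type 2": (0, 19, 12, 34),
    "omission-sensitive units, Type 1 vs Type 2": (2, 17, 12, 18),
    "unpredicted-reward responses, Type 1 vs Type 2": (0, 8, 8, 16),
    "mixed AMPAR/GABAAR input, DA vs GABA neurons": (2, 9, 8, 11),
}

for label, counts in comparisons.items():
    chi2, p = chi_square_2x2(*counts)
    print(f"{label}: chi-square = {chi2:.4f}, p = {p:.4f}")
```

With small expected cell counts, as in several of these tables, a Fisher's exact test or a continuity correction would give more conservative p values; the uncorrected form is used here only because it matches the reported statistics.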

Decoding Circuit Components in Reward Processing 

Our reward omission experiments allowed us to distinguish be- 
tween LH neural encoding of the CR and the consumption of 
Figure 7. Photoactivation of the GABAergic, but Not the Glutamatergic, 
Component of the LH-VTA Projection Increased Feeding 
Behaviors 

(A and B) In order to selectively activate glutamatergic or GABAergic LH-VTA 
projections, VGLUT2::Cre and VGAT::Cre mice received an injection of 
AAV5-DIO-ChR2-eYFP or AAV5-DIO-eYFP into the LH and had an optic fiber implanted 
over the VTA. In the sucrose-seeking task, there were no significant 
differences in the numbers of port entries per cue in any epoch for LHglut-VTA:ChR2 
mice (n = 7) compared to LHglut-VTA:eYFP control mice (n = 6) (A) 
nor in those of LHGABA-VTA:ChR2 mice (n = 6) compared to LHGABA-VTA:eYFP 
mice (n = 8) (B). 

(C) There was no significant difference between LHglut-VTA:ChR2 mice and 
eYFP controls in feeding behavior. 

(D) However, LHGABA-VTA:ChR2 mice showed a significant increase in time 
spent feeding during light stimulation compared to LHGABA-VTA:eYFP controls 
(two-way ANOVA revealed a group × epoch interaction, F2,24 = 4.78, p = 
0.0178; Bonferroni post-hoc analysis, **p < 0.01). 

(E and F) Neither LHglut-VTA:ChR2 mice (E) nor LHGABA-VTA:ChR2 mice (F) 
showed a difference in tail withdrawal latency compared to their respective 
controls. 

(G) LH-VTA:ChR2 mice showed a significant increase in time spent gnawing 
during the light ON epoch compared to eYFP controls (two-way ANOVA revealed 
a group × epoch interaction, F2,24 = 4.78, p = 0.0179; Bonferroni post-hoc 
analysis, ***p < 0.001). 

(H) There was no significant difference between LHglut-VTA:ChR2 and LHglut-VTA:eYFP 
controls in gnawing behavior. 

(I) However, LHGABA-VTA:ChR2 animals also showed a significant increase in 
time spent gnawing during the light ON epoch compared to LHGABA-VTA:eYFP 
controls (two-way ANOVA revealed a group × epoch interaction, F2,24 = 18.91, 
p < 0.0001; Bonferroni post-hoc analysis, ****p < 0.0001). 

(J) The difference score for gnawing behavior between the ON and OFF epochs 
was significantly greater in LHGABA-VTA:ChR2 animals in comparison 
with either wild-type LH-VTA:ChR2 or LHglut-VTA:ChR2 animals (one-way 
ANOVA, F2,18 = 16.76, p < 0.0001; Bonferroni post-hoc analysis, ***p < 0.001). 

(K) Frequency-response curve showing the effect of different blue-light stim- 
ulation frequencies (OFF, 5 Hz, 10 Hz) on behavior in LHGABA-VTA:ChR2 
animals. 

Error bars indicate ± SEM. See also Figure S7. 



the unconditioned stimulus (US). In these experiments, a subset 
of Type 2 units responded to the reward-predictive cue (CS) and 
the US and also showed a decrease in firing rate when expected 
rewards were omitted. Furthermore, a subset of Type 2 units also 
show phasic excitation upon unexpected reward delivery (Fig- 
ures 4G and 4H). These data are reminiscent of the way DA neu- 
rons in the VTA encode reward-prediction error (Cohen et al., 
2012; Schultz et al., 1997). We speculate that VTA neurons 
may transmit reward-prediction error signals to a subset of LH 
neurons, which are well-positioned to integrate these signals 
for the determination of an appropriate behavioral output. 
Specifically, the LH is robustly interconnected with a multitude 
of other brain areas (Berthoud and Münzberg, 2011) and has 
been causally linked to homeostatic states such as sleep/arousal 
and hunger/satiety (Carter et al., 2009; Jennings et al., 2013). 

A Causal Role for the LH-VTA Pathway in Compulsive 
Sucrose Seeking? 

Compulsive reward-seeking behavior has primarily been dis- 
cussed in the context of drug addiction, wherein a classic para- 
digm for compulsive drug seeking has been to examine the de- 
gree to which drug-seeking behavior persists in the face of a 
negative consequence, such as a foot shock (Belin et al., 2008; 
Pelloux et al., 2007; Vanderschuren and Everitt, 2004). We 
adapted this task for sucrose seeking to allow us to investigate 
whether activation of the LH-VTA pathway was sufficient to pro- 
mote compulsive sucrose seeking. Given that a distinct differ- 
ence between drug and natural reward is that drug rewards 
are not necessary for survival, there is controversy as to what be- 
haviors would constitute compulsive sucrose- or food-seeking 
behavior. An alternative interpretation of our data is that activa- 
tion of the LH-VTA pathway simply increases motivational drive 
or the urge to seek appetitive reinforcers. As the rates of obesity 
have increased in recent decades (Mietus-Snyder and Lustig, 
2008), compulsive overeating and sugar addiction are prevalent 
conditions that are a major threat to human health (Avena, 2007). 
The feeding behavior in sated (fully fed) mice after activation 
of the LH-VTA pathway is reminiscent of eating behaviors seen 
in humans diagnosed with compulsive overeating disorder (or 
binge-eating disorder) (DSM-V). 

It has been proposed that repeated actions lead to the for- 
mation of habits, which themselves lead to the compulsive 
reward seeking that characterizes addiction (Everitt and Rob- 
bins, 2005). Our finding that LH-VTA neurons only encode port 
entry after conditioning suggests that this pathway is selec- 
tively encoding a conditioned response, not just a motivated 
action. This is consistent with our observations that optically 
activating this projection can promote compulsive reward 
seeking in the face of a negative consequence (Figure 5C), 
as well as in the absence of need (as seen in sated mice, Fig- 
ure 5E). This interpretation is further substantiated by our 
finding that photoinhibition of the LH-VTA pathway selectively 
reduces compulsive sucrose seeking (Figure 5D) but does 
not reduce feeding in food-restricted mice (Figure 5F). One 
of the greatest challenges in treating compulsive overeating 
or binge-eating disorders is the risk of impairing feeding be- 
haviors in general. From a translational perspective, we may 
have identified a specific neural circuit as a potential target 
for the development of therapeutic interventions for compul- 
sive overeating or sugar addiction without sacrificing natural 
feeding behaviors. 

Composition of LH Input to the VTA 

We show that in addition to a glutamatergic LH-VTA component 
(Kempadoo et al., 2013), there is also a significant GABAergic 
component in the projection (Leinninger et al., 2009), and that 
LH neurons synapse directly onto both DA and GABA neurons 
in the VTA (Figure 6). However, there is a difference in the bal- 
ance of the excitatory/inhibitory input onto VTA DA and GABA 
neurons. 

While we used immunohistochemical processing to verify the 
identity of VTA neurons, we also measured Ih, a hyperpolarization-activated 
inwardly rectifying non-specific cation current 
(Lacey et al., 1989; Ungless and Grace, 2012). The presence of 
this current has been widely used in electrophysiological studies 
to identify DA neurons, but it has been shown to be present only 
in subpopulations of DA neurons, delineated by projection target 
(Lammel et al., 2011). Although it has previously been proposed 
in a review by Fields and colleagues that “LH neurons synapse 
onto VTA projections to the PFC, but not those projecting to 
the NAc” (Fields et al., 2007), our data suggest that this controversy 
be reopened for further investigation. Even though we did 



observe a subset of DA neurons that received net excitation from 
the LH and possessed a very small Ih (consistent with mPFC- or 
NAc medial shell-projecting DA neurons), we also observed a 
subset of DA neurons that received net excitatory input and 
showed a large Ih (consistent with characteristics of DA neurons 
projecting to the lateral shell of the NAc; Figure S5; Lammel et al., 
2011). Conversely, VTA DA neurons that received a net inhibitory 
input showed a very small Ih or lacked this current, which is 
consistent with the notion that the LH sends predominantly 
inhibitory input onto VTA DA neurons projecting to the mPFC 
or the medial shell of the NAc. We also show that LH inputs 
can be observed in both medial and lateral VTA, suggesting 
that the LH provides inputs onto VTA neurons with diverse pro- 
jection targets, as it is known that VTA projection target corre- 
sponds somewhat to spatial location along a medial-lateral 
axis (Lammel et al., 2008). 

Excitation/Inhibition Balance in the LH-VTA Pathway 

The role of the LH-VTA pathway in promoting reward has previ- 
ously been ascribed to glutamatergic transmission in the VTA 
(Kempadoo et al., 2013), as the CaMKIIα promoter is often 
thought to be selective for excitatory projection neurons. However, 
our data clearly show that expressing ChR2 under the control 
of the CaMKIIα promoter also targets GABAergic projection 
neurons in the LH (Figure 6). 

The behavior elicited by photostimulation of the LHGABA-VTA 
pathway was frenzied, mis-directed, and maladaptive (Movie 
S4). One interpretation is that activation of the LHGABA-VTA 
pathway sends a signal to the mouse that causes the recogni- 
tion of an appetitive reinforcer. An alternative interpretation is 
that the LHGABA-VTA pathway might drive incentive salience 
or an intense “wanting,” consistent with a signal underlying 
conditioned approach, but at a non-physiological level that pro- 
duces this aberrant feeding-related behavior (Berridge and 
Robinson, 2003). Consistent with this, it is possible that activa- 
tion of the LHGABA-VTA projection actually produces intense 
sensations of craving, or urges to feed. However, our experi- 
ments show that activation of LHGABA-VTA does not produce 
an increase in compulsive sucrose seeking, but this is likely 
due to the excessive gnawing and aberrant appetitive behaviors 
focused on non-food objects in the testing chamber. Although it 
is difficult to determine the experience of the mouse during this 
manipulation, it is clear that appropriately directed feeding- 
related behaviors require the coordinated activation of both 
the GABAergic and glutamatergic components of the LH-VTA 
pathway. 

Conclusion 

Optogenetic and pharmacogenetic manipulations are powerful 
tools for establishing causal relationships, yet they do not reveal 
the endogenous, physiological properties of neural circuit ele- 
ments. Our study unifies information about the synaptic connec- 
tivity, the naturally occurring endogenous function, and the 
causal role of the LH-VTA pathway, providing a new level of 
insight toward how information is integrated in this circuit. These 
results highlight the importance of examining the functional role 
of neurons by connectivity, in addition to genetic markers. LH- 
VTA neurons selectively encoded the action of reward seeking 
but did not encode environmental stimuli, whereas rewarding 
stimuli and reward-predictive cues were encoded by a discrete 
population of LH neurons downstream of the VTA. Furthermore, 
we have identified a specific projection that is causally linked to 
compulsive sucrose-seeking and feeding behavior. The hetero- 
geneity in the LH-VTA projection is necessary for providing an 
adaptive balance between driving motivation and regulating 
appropriately directed appetitive behaviors. These findings pro- 
vide insights relevant to pathological conditions such as compul- 
sive overeating disorder, sugar addiction, and obesity. 

EXPERIMENTAL PROCEDURES 
Phototagging VTA-Projecting LH Neurons 

To limit expression of ChR2 to only LH neurons projecting to the VTA, 
AAV5-DIO-ChR2-eYFP was injected into the LH and HSV-EF1α-IRES-Cre-mCherry 
into the VTA. In NpHR inhibition experiments, AAV5-CaMKIIα-eNpHR3.0-eYFP 
was injected into the VTA as well. An optrode was implanted in the LH 
and an optic fiber over the VTA. 

Partial Reinforcement Sucrose Retrieval Task 

For in vivo recording, animals were trained on a partial reinforcement sucrose 
retrieval task, where 50% of nosepokes were followed by a cue predicting the 
delivery of sucrose at the port entry. Adjustments were made to this task to 
examine the effects on reward omission by omitting sucrose deliveries from 
a subset of cues and to examine the effects on unexpected reward by the de- 
livery of sucrose without the existence of the cue. 

Sucrose Seeking in the Face of a Negative Consequence 

To study the effect on conditioned responding by stimulation of LH-VTA 
projections, we developed a task wherein an animal must cross a shock 
floor to obtain a sucrose reward. Wild-type animals with ChR2, NpHR, or 
eYFP injected either unilaterally (AAV5-CaMKIIα-ChR2-eYFP) or bilaterally 
(AAV5-CaMKIIα-eNpHR3.0-eYFP) in the LH with an optic fiber placed over 
VTA or VGLUT2::Cre and VGAT::Cre animals with AAV5-DIO-ChR2-eYFP 
injection in the LH and optic fiber over the VTA were tested. Because LH- 
VTA:ChR2 mice showed an increase in sucrose seeking in the face of a 
negative consequence, these animals were sated before evaluating the 
effects of photostimulation on feeding on normal chow. In contrast, LH- 
VTA:NpHR mice showed a decrease in sucrose seeking in the face of a 
negative consequence and were therefore mildly food restricted before 
testing the effects of photostimulation on feeding on normal chow. 

Ex Vivo Characterization of LH-VTA 

Whole-cell patch-clamp recordings were used to study the input of LH neurons 
onto DA and GABA VTA neurons. DA neurons were identified by filling cells 
with biocytin and post-hoc immunostaining for TH. GABA cells were identified 
during recordings by fluorescence due to AAV5-DIO-mCherry injection into the 
VTA of VGAT::Cre animals. 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Discussion, Extended Experi- 
mental Procedures, seven figures, and four movies and can be found with 
this article online at http://dx.doi.org/10.1016/j.cell.2015.01.003. 
SUMMARY 

Excitatory amino acid transporters (EAATs) are 
essential for terminating glutamatergic synaptic 
transmission. They are not only coupled glutamate/ 
NaVHVK^ transporters but also function as anion- 
selective channels. EAAT anion channels regulate 
neuronal excitability, and gain-of-function mutations 
in these proteins result in ataxia and epilepsy. We 
have combined molecular dynamics simulations 
with fluorescence spectroscopy of the prokaryotic 
homolog GItph and patch-clamp recordings of 
mammalian E/\ATs to determine how these trans- 
porters conduct anions. Whereas outward- and 
inward-facing GItph conformations are nonconduc- 
tive, lateral movement of the glutamate transport 
domain from intermediate transporter conformations 
results in formation of an anion-selective conduction 
pathway. Fluorescence quenching of inserted tryp- 
tophan residues indicated the entry of anions into 
this pathway, and mutations of homologous pore- 
forming residues had analogous effects on GItph 
simulations and EAAT2/E/\AT4 measurements of sin- 
gle-channel currents and anion/cation selectivities. 
These findings provide a mechanistic framework of 
how neurotransmitter transporters can operate as 
anion-selective and ligand-gated ion channels. 

INTRODUCTION 

Secondary active glutamate transport by excitatory amino acid
transporters (EAATs) (Kanner and Sharon, 1978) terminates
glutamatergic synaptic transmission and regulates glutamate
concentrations within the CNS. EAATs can also function as
anion-selective channels (Fairman et al., 1995; Wadiche and
Kavanaugh, 1998), with EAAT anion channels regulating cell
excitability and synaptic transmission (Picaud et al., 1995). Their
physiological relevance is emphasized by the recent discovery
that altered EAAT anion conduction is associated with episodic
ataxia and epilepsy (Winter et al., 2012).
EAAT anion permeation occurs through a defined anion-selec-
tive conduction pathway (Kovermann et al., 2010), which is
opened and closed through conformational changes coupled
to transitions within the glutamate uptake cycle (Bergles et al.,
2002; Machtens et al., 2011a; Otis and Kavanaugh, 2000). The
channels are perfectly anion selective (Wadiche and Kavanaugh,
1998) and exhibit unitary current amplitudes, which are small but
of a similar size range to those of specialized anion channels
(Schneider et al., 2014). The five mammalian EAATs differ in their
relative glutamate transport rates and anion currents, resulting in
isoform-specific differentiation into efficient transporters associ-
ated with small macroscopic anion currents and low-capacity
transporters that predominantly conduct anions (Mim et al.,
2005). However, the functional properties of the underlying anion
channels are very similar for each type (Schneider et al., 2014;
Torres-Salazar and Fahlke, 2007), indicating conservation of
the anion-conducting pore among functionally specialized trans-
porters. So far, the localization of this conduction pathway, the
underlying conformation of the transporter, and the mechanisms
of anion permeation have not been identified.

We used molecular dynamics (MD) simulations to identify
which conformations of the archaeal glutamate transporter homo-
log GltPh (Yernool et al., 2004) permit anion permeation and to
characterize the molecular features of anion conduction. We
analyzed the conformational changes leading to the formation
of an anion-selective pore and observed ion permeation along
this path in simulations. Using mutagenesis, fluorescence spec-
troscopy experiments on GltPh, and patch-clamp recordings on
mammalian EAATs, we confirmed that the anion channel confor-
mation we identified exists under experimental conditions and
that this permeation pathway is utilized by both prokaryotic
and mammalian glutamate transporters.

RESULTS 

Molecular Dynamics Simulations Identify Anion-
Conducting Conformations of GltPh

GltPh shares about 37% sequence identity with mammalian
EAATs and is an accepted model of this class of transporters
for studying secondary active transport and anion conduction
(Boudker et al., 2007; Groeneveld and Slotboom, 2010; Ryan
and Mindell, 2007; Yernool et al., 2004). High-resolution crystal

Figure 1. An Anion-Selective Conduction Pathway Is Formed by a Substrate Transport Intermediate

(A) Red isodensity meshes illustrate the Cl− distribution (0.2σ) in MD simulations at +1.6 V around substrate-bound GltPh monomers in various conformations
(trimerization domain shown in blue; transport domain shown in yellow). The other two monomers, water molecules, lipids, and ions were omitted for clarity.
Nonconductive outward-facing (OFC), inward-facing (IFC), and intermediate conformations (ICs; derived using essential dynamics sampling of the transition from
OFC to IFC) are shown in side view.

(B and C) Ion permeation (B) and conformational change (C) of ICcen during transition to an open channel conformation (overlay of ICcen and ChCcen, top view)
upon application of a membrane potential (±1.6 V).

(D) Transitions of ICs into open channel conformations (ChCs) containing an anion-selective pathway occur at positive and negative potentials.

(E) Visualization of all trajectories (the OFC-IFC translocation/essential dynamics sampling simulation and separate MD simulations of OFC, ICout, ICcrystal, ICcen,
ICint, IFC, and of the ChCs of the intermediates) in the principal component space by projection onto the first (EV1) and fourth (EV4) eigenvectors, corresponding to
translocation and pore formation, respectively. Black dots represent nonconducting conformations. Blue, green, and orange dots denote frames in MD tra-
jectories, where Cl− permeation through the respective ChC conformation was observed. Note that the point density is biased by the number and length of
simulations initiated from the various starting conformations (red circles) and therefore does not provide information on energetics.

See also Figures S1, S2, and S3 and Movie S1.



structures revealed a trimeric assembly, with each subunit con- 
taining eight transmembrane helices (TM) and two hairpin loops 
(HP) (Yernool et al., 2004). Analysis of different conformations 
demonstrated that substrate translocation involves a large-scale 
(~18 Å) rotational translational movement of the substrate-
harboring transport domain relative to the static trimerization
domain (Crisman et al., 2009; Reyes et al., 2009). 

We used all-atom MD simulations capable of directly simu-
lating ion flux driven by transmembrane voltages (Kutzner
et al., 2011) to investigate anion permeation in substrate-bound
GltPh. Simulations were performed using various GltPh conforma-
tions in the presence of 1 M NaCl on either side of the membrane.
Positive and negative membrane potentials (initially ±1.6 V; later
±800 mV) applied to increase anion permeation rates had no
detrimental effects on the stability of the system (Figure S1 avail-
able online), in good agreement with the results of other simula-
tion studies (Jensen et al., 2012). Within a total simulation time of
>8 μs, no Cl− permeation events were observed for the known



outward- (OFC) and inward-facing (IFC) conformations at mem-
brane potentials up to ±1.6 V, indicating that none of these states
is anion conducting (Figures 1A and 1B; Extended Experimental
Procedures). We concluded that translocation intermediates
might correspond to the precursors of anion-conducting confor-
mational states and simulated the OFC-IFC transition to obtain
novel intermediate conformations (ICs) using essential dynamics
sampling (Amadei et al., 1996). In these simulations, the trans-
porter was driven along the first eigenvector (EV1), representing
transmembrane translocation of the transport domain, from
a principal component analysis (PCA) of the conformational
changes of transporter monomers in simulations on OFC and
IFC (Figures 1A, S1C, and S1D). Because individual subunits
function independently within the trimeric assembly (Erkens
et al., 2013; Grewer et al., 2005), translocation simulations
were performed on a single monomer, with the other two
remaining in the OFC. These simulations correctly sampled
the recently crystallized GltPh intermediate (ICcrystal; minimum

Figure 2. Hydrophobic Gating of the Anion Permeation Pathway

(A) Representative (spheres) and averaged (black mesh) water distribution in
the transport/trimerization domain interface of GltPh in various conformations
(side view).

(B) Voltage-dependent occupancy of the interface core region by water mol- 
ecules (counted within a cylindrical slab; each data point corresponds to a 
100 ns simulation). Note that increased water numbers in ChC, but not in IFC, 
result in the formation of a continuous water bridge between the extra- and 
intracellular space (A). Error bars indicate the SD of the water molecule counts 
during the simulations. 



monomeric root-mean-square deviation [RMSD] of 1.3 Å). This
demonstrates the existence and stability of translocation
intermediates (Verdon and Boudker, 2012) and validates the
simulated transition pathway (Figure S2C). We then chose three
intermediates, ICout (similar to ICcrystal), ICcen, and ICint, from our
trajectory that were equally distributed between the OFC and
IFC, and, together with ICcrystal, subjected them to further MD
under transmembrane voltage (Figures 1A and S1).

All intermediates were impermeable to Cl− and remained closed
for hundreds of nanoseconds in the absence of membrane
voltage (Figures 1A and S1G; Extended Experimental Proce-
dures). However, for ICcrystal, ICcen, and ICint, lateral movement



of the transport domain occurred 70-300 ns after applying
membrane potentials >±1.3 V. These conformational changes
resulted in open channel conformations (designated here as
ChCcrystal, ChCcen, or ChCint) that were centrally localized on the
translocation reaction coordinate and that exhibited an anion-se-
lective conduction pathway at the interface between the trimeri-
zation and transport domains, near the tip of HP1 (Figures 1B-
1D; Movie S1). Pore opening and closing always reversed after
changing the applied voltage, and neither protein instability nor
electroporation through the lipid bilayer were observed (Figures
S1E-S1H and S3A-S3E). Pore opening occurred from various in-
termediate conformations, however, with different opening pro-
pensities in the order of ICout < ICcrystal << ICint < ICcen. For ICout
and ICcrystal, channel opening was never or only once observed
(0 out of 4 [ICout] or 1 out of 5 [ICcrystal, at +1.6 V] simulations) within
~400 ns for each (Figures S2D and S2E). In contrast, such transi-
tions were regularly seen for ICcen and ICint (20 out of 20 [ICcen, at
1.3-1.6 V] or 4 out of 4 [ICint, -1.6 V] simulations) (Figure 1B). To
further analyze the conformational changes underlying channel
opening and to relate them with translocation of GltPh, we per-
formed an additional PCA on all data, including the previously
used set of OFC and IFC trajectories, the translocation simula-
tions and all simulations of intermediates under membrane volt-
ages. In addition to EV1, which remained unchanged compared
with the first PCA and represents translocation, we found that
conformational changes along the fourth eigenvector (EV4) corre-
lated with the onset and ending of anion permeation (Figures
S3C-S3E). We plotted the position taken up by the trajectories
in the principal component space, as defined by eigenvectors
EV1 and EV4, which describe conformational changes attributed
to translocation and anion channel gating, respectively (Fig-
ure 1E). Although originating from different intermediate states,
open channel conformations ChCcen and ChCint had RMSDs
approaching 1.0 Å with similar overall structures and will be
treated as a single conformation (ChC) (Figures 1E and S3F-S3H).
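
The projection of trajectory frames onto selected eigenvectors, as used for the EV1/EV4 plot, can be illustrated with a minimal sketch. It assumes NumPy is available and that every frame has already been superposed and flattened into one row of coordinates. The function name `project_on_eigenvectors` and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def project_on_eigenvectors(frames, indices):
    """Project frames (n_frames x n_coords) onto chosen principal components.

    `indices` are zero-based ranks of the eigenvectors sorted by decreasing
    eigenvalue (0 corresponds to EV1, 3 to EV4).
    """
    x = np.asarray(frames, dtype=float)
    centered = x - x.mean(axis=0)            # remove the average structure
    cov = np.cov(centered, rowvar=False)     # coordinate covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # re-sort to descending variance
    selected = eigvecs[:, order[list(indices)]]
    return centered @ selected               # one column per requested eigenvector

# Toy "trajectory" (assumption): large motion along x, small alternating motion along y.
toy = [[float(t), 0.1 * (-1) ** t, 0.0] for t in range(-5, 6)]
proj = project_on_eigenvectors(toy, indices=[0, 1])
```

The sign of each eigenvector is arbitrary, so only the magnitude of a projection is meaningful when comparing independent analyses.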

Formation of the anion conduction pore at the interface be-
tween the transport and trimerization domains is accompanied
by extensive hydration of this region and the creation of a contin- 
uous water bridge spanning the membrane (Figure 2A). In each 
of our simulations, this process was reversible with channel 
closure preceded by complete dewetting. Water entry is pro- 
moted by both positive and negative potentials with voltage-in- 
dependent water occupancy between -400 and +400 mV 
(Figure 2B). The hydrophobic environment of this region (see 
below) is expected to represent a barrier to anion permeation 
that can be dynamically lowered by the entry of water molecules. 
Wetting of the rather hydrophobic interface region might 
compensate for the energetic cost of breaking hydrophobic in- 
teractions between the surfaces of the trimerization and trans- 
port domains during the conformational change that broadens 
the interface. Therefore, we suggest that channel opening and 
closing is mediated by a combination of steric and hydrophobic 
gating, as has been demonstrated for some other ion channels 
(Jensen et al., 2012; Vaitheeswaran et al., 2004). 
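
The water occupancy of the interface region is described as a count of water molecules within a cylindrical slab, averaged over a simulation. A minimal sketch of such a count follows. The cylinder axis is assumed to be parallel to z, the use of the sample standard deviation is an assumption, and the function names and toy coordinates are hypothetical.

```python
from statistics import mean, stdev

def count_in_cylinder(positions, center_xy, radius, z_min, z_max):
    """Count positions (x, y, z) inside a cylinder whose axis is parallel to z."""
    cx, cy = center_xy
    r2 = radius ** 2
    return sum(
        1
        for x, y, z in positions
        if z_min <= z <= z_max and (x - cx) ** 2 + (y - cy) ** 2 <= r2
    )

def occupancy_stats(frames, center_xy, radius, z_min, z_max):
    """Mean and sample SD of the per-frame counts over a trajectory."""
    counts = [count_in_cylinder(f, center_xy, radius, z_min, z_max) for f in frames]
    return mean(counts), stdev(counts)

# Two toy frames of water-oxygen coordinates (arbitrary length units, assumption).
frames = [
    [(0.0, 0.0, 0.0), (0.5, 0.0, 1.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0), (3.0, 0.0, 0.0)],
]
first_count = count_in_cylinder(frames[0], (0.0, 0.0), 1.0, -2.0, 2.0)
avg, sd = occupancy_stats(frames, center_xy=(0.0, 0.0), radius=1.0, z_min=-2.0, z_max=2.0)
```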

Structural Determinants of GltPh Anion Permeation

The GltPh anion conduction pathway has a distorted hourglass
shape, with large extra- and intracellular entrance cavities that
narrow to a more constricted conduction path almost perpen-
dicular to the membrane (Figures 1D, 3A, and 3B; Movie S1).
Pore-forming residues are highly conserved between GltPh
and mammalian EAATs (Figure S4). This level of conservation
is consistent with the functional similarity between GltPh (Ryan
and Mindell, 2007) and EAAT anion channels (Melzer et al.,
2003; Wadiche and Kavanaugh, 1998) and accounts for similar
unitary current amplitudes among EAATs with large glutamate



transport and small anion current compo-
nents and those with predominant anion
conductance (Schneider et al., 2014;
Torres-Salazar and Fahlke, 2007). Most
side chains lining the pore center are hy-
drophobic, except for R276, which pro-
trudes from the tip of HP1 into the Cl−
density (Figures 3A and 3B). EAATs lack
a positive side chain at the position corre-
sponding to R276, but contain arginine at
positions homologous to M395 of GltPh
(Ryan et al., 2010) (Figure S4). MD simulations of the R276S-
M395R GltPh mutant (see below) showed that both the arginine
in the “EAAT position” and R276 project their side chain toward
the same location, resulting in conservation of the positive
charge at this site in the tertiary structure of EAATs and GltPh
(Figures 3A and 3B).

Starting from the ChCcen conformation, we simulated anion 
permeation at reduced voltages of ~±800 mV and observed

Figure 3. Structural Determinants of GltPh
Anion Permeation

(A) Cartoon representation of a ChCcen monomer
(light blue, trimerization domain; yellow, transport
domain) in side view from the subunit interface,
with pore-lining side chains shown as sticks (blue,
positive; red, negative; green, polar; gray, apolar).
TM2 and TM5 are partially omitted for clarity. Red
spheres represent snapshots of a single perme-
ating Cl− ion.

(B) Close-up of the permeation pathway from the
TM4-TM5 loop. Coloring as in (A), including
representative water molecules found in the inner
hydration shell of permeating Cl− ions.

(C) Count of Cl− and I− permeation events through
ChCcen and ChCint at +800 and -900 mV (dashed
lines), respectively.

(D) EAAT4 current voltage plots for various sym-
metrical [NaCl]. Single-channel currents were
determined by multiplying whole-cell Cl− currents
recorded 1 ms after the voltage jump (means ± SE,
n > 10 for each condition) by the ratio of experi-
mentally measured unitary current (Figure 6B) and
mean current amplitudes (n = 12) at +150 mV in
140 mM NO3−. The experimental data (symbols)
were globally fitted using a three-binding site
Eyring rate model (lines; Extended Experimental
Procedures). The inset displays GltPh unitary cur-
rent amplitudes (red symbols) from MD simula-
tions using 1 M NaCl and the extrapolation of
the experimental data to these conditions by the
Eyring model (red line).

(E) Pore profile of anion hydration, pore diameter,
and Poisson-Boltzmann energies for Na+ and Cl−
of WT and R276S and of WT GltPh in an apo state,
i.e., after removal of aspartate and Na+ ions (in
ChCcen). Hydration numbers are the average
number of hydrogens within the first Cl− hydration
shell. Cl− isodensity meshes around ChCcen (4.2σ)
illustrate two Cl−-binding sites at the channel en-
trances, denoted Sext and Sint.

See also Figure S4.
perfect anion selectivity (Figure 3C). In 1 M NaCl, our simula-
tions yielded single-channel anion currents of 42.4 ± 6.3 pA
(~+800 mV) or 51.4 ± 6.7 pA (~-900 mV). These voltages and
salt concentration are too high to permit direct experimental
verification of the conductances. We measured EAAT4 anion
currents at 140-750 mM NaCl and at voltages up to 500 mV
(Figure 3D) to extrapolate the voltage dependence of EAAT4 uni-
tary anion currents to the MD conditions. Comparing these
experimental EAAT4 and simulated GltPh current-voltage rela-
tionships demonstrated that simulations reproduce the experi-
mental unitary current amplitudes within the same order of
magnitude (Figure 3D, inset). Substitution of Cl− by I− in the sim-
ulations resulted in significantly higher anion currents of 95.0 ±
5.4 pA (~+800 mV) or of 97.1 ± 10 pA (~-900 mV; Figure 3C);
however, the transport substrate aspartate did not permeate
within 200 ns at concentrations of ~500 mM. Simulated perme-
ation properties thus closely resemble the functional character-
istics of mammalian EAATs (Melzer et al., 2003; Wadiche and
Kavanaugh, 1998).
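
A simulated unitary current relates to the number of permeation events through simple charge counting: each monovalent ion that crosses the pore carries one elementary charge. The sketch below shows that conversion in both directions. The function names are hypothetical, and the calculation is a generic illustration, not the analysis script used for the simulations.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per elementary charge

def unitary_current_pA(n_events, sim_time_ns, valence=1):
    """Mean current (pA) carried by n_events net ion crossings in sim_time_ns."""
    if sim_time_ns <= 0:
        raise ValueError("simulation time must be positive")
    charge_c = n_events * abs(valence) * ELEMENTARY_CHARGE_C
    current_a = charge_c / (sim_time_ns * 1e-9)   # amperes
    return current_a * 1e12                       # picoamperes

def events_per_ns(current_pA, valence=1):
    """Net ion crossings per nanosecond that correspond to a given current."""
    ions_per_second = current_pA * 1e-12 / (abs(valence) * ELEMENTARY_CHARGE_C)
    return ions_per_second * 1e-9

one_ion_per_ns = unitary_current_pA(1, 1.0)   # about 160 pA
rate_42pA = events_per_ns(42.4)               # about 0.26 crossings per ns
```

By this arithmetic, a current of 42.4 pA corresponds to roughly 26 Cl− crossings per 100 ns of simulation.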

The electrostatic Poisson-Boltzmann energy profile for mov-
ing an ion along the channel axis displays much higher energy
barriers for Na+ than for Cl− (Figure 3E). Energy wells at both en-
trances with high Cl− densities represent Cl− binding sites, de-
noted Sext and Sint. The critical role of R276 in anion selectivity
is demonstrated by the convergence of Na+ and Cl− energy bar-
riers upon removal of the positive charge in R276S GltPh (Fig-
ure 3E). Energy profiles are identical in both the presence and
absence of bound aspartate/Na+ at their binding sites (Figure 3E).
Simulated ion permeation through this apo state revealed similar
Cl− permeation rates along the same permeation pathway (data
not shown), consistent with the experimentally determined uni-
tary conductances of EAATs being indistinguishable in the pres-
ence and absence of substrate (Kovermann et al., 2010). The
conduction pathway is rather wide with a minimum diameter of
5.6 Å, such that anions can permeate in a partially hydrated
state, and Cl−-H(water) coordination numbers show only a small
decrease from 6.8 in bulk solution to 5.2 in the GltPh channel cen-
ter (Figure 3E).

Tryptophan-Scanning Mutagenesis Reveals Direct 
Interactions of Predicted Pore-Forming Residues with 
Permeant Anions 

We used a combination of tryptophan-scanning mutagenesis
and iodide quenching (Vazquez-Ibar et al., 2004) to test whether
permeating anions come into close contact with amino acid
side chains projecting into the proposed anion conduction
pathway. I− readily permeates through GltPh and EAAT anion
channels and is therefore expected to come into close proximity
to residues forming the permeation pathway. Because I− can
reduce tryptophan fluorescence via direct interactions, i.e.,
collisional quenching (Lakowicz, 2006), iodide quenching of
tryptophan fluorescence is a suitable method to experimentally
verify the simulated GltPh anion permeation pathway. As GltPh
lacks endogenous tryptophans, we generated single-tryptophan
mutants by substituting 13 residues that protrude from the
trimerization domain into the interface region of GltPh ChCcen
(Figure 4A). To avoid interference with substrate binding, we
did not insert tryptophan residues into the transport domain.



With the exception of S65W, I− reduced fluorescence in all GltPh
mutants in a concentration-dependent manner. Figure 4B shows
the spectral properties of V51W and S65W GltPh in detergent
micelles and their modification at various [I−]. The identical con-
centration dependences of the fluorescence lifetimes and inten-
sities indicate that I− quenches tryptophan fluorescence via a
collisional mechanism (Figure 4B, inset).

Figure 4A maps the relative quenching (F0/F) at [I−] = 350 mM of
the tested tryptophan residues on the ChCcen structure. These
data demonstrate the high iodide accessibility of residues close
to the proposed anion permeation pathway, which is reduced
with increasing distance. Linear concentration dependences of
F0/F in Stern-Volmer plots are expected for proteins with a single
tryptophan, which adopt only one conformation. The observed
deviations from linearity indicate that tryptophan-substituted
GltPh mutants assume multiple conformations that differ in I−
accessibility (Figure 4C). These findings support the notion that
tryptophan-substituted GltPh mutants exhibit similar degrees of
conformational heterogeneity to wild-type (WT) GltPh.

Figure 4D shows plots of the calculated anion accessibilities
in simulations of different conformations compared with exper-
imentally observed fluorescence quenching. Most residues
are accessible in multiple conformations. However, three
residues (W50 and W54, projecting directly into the anion
pore, and W62, at a more peripheral location) show rather
exclusive anion accessibility in ChC (Figures 4D and S5;
see the Extended Experimental Procedures for details on the
calculation of anion accessibilities from the simulations). For
these constructs, a modified Stern-Volmer analysis was used
to determine the fraction of fluorescence quenchable by I− to
be ~20% (Figures S5A and S5B). Because different protein
conformations could exhibit different quantum yields of the
inserted tryptophan, this value is not always identical to the
probability of the protein assuming this accessible conforma-
tion. However, these data indicate that GltPh can assume
the anion-conducting channel conformation ChC even in the
absence of an applied voltage and that this conformation is
sufficiently stable to permit I− collisions with side chains that
project into the anion conduction pathway.
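
The text refers to a modified Stern-Volmer analysis without stating its form. The sketch below assumes the commonly used relation F0/(F0 − F) = 1/(fa·Ka·[Q]) + 1/fa, in which fa is the quenchable fraction of the fluorescence and Ka the quenching constant of the accessible population. That assumption, the function names, and the synthetic data are illustrative only, and the authors' actual procedure may differ.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def modified_stern_volmer(conc_molar, f0, intensities):
    """Return (accessible fraction f_a, quenching constant K_a in 1/M)."""
    xs = [1.0 / c for c in conc_molar]                 # 1/[Q]
    ys = [f0 / (f0 - f) for f in intensities]          # F0 / (F0 - F)
    slope, intercept = linear_fit(xs, ys)
    return 1.0 / intercept, intercept / slope

# Synthetic data (assumption): 20% of the fluorescence is quenchable, K_a = 10 per molar.
f0, fa_true, ka_true = 100.0, 0.2, 10.0
conc = [0.05, 0.1, 0.2, 0.35, 0.5]
obs = [f0 * (1 - fa_true + fa_true / (1 + ka_true * c)) for c in conc]
fa, ka = modified_stern_volmer(conc, f0, obs)
```

The intercept of the double-reciprocal plot gives 1/fa, so a quenchable fraction near 20% corresponds to an intercept near 5.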

Mutations of Pore-Forming Residues Affect Unitary 
Anion Current Amplitudes and Anion/Cation 
Permeability Ratios 

To provide further verification of the predicted anion permeation 
pathway, we compared the effects of amino acid exchanges on 
simulated and experimental permeation properties. We chose 
experimental measures corresponding to parameters obtained 
from MD simulations: these included the single-channel conduc- 
tance and the anion/cation selectivity of the anion channel. In 
contrast, macroscopic current amplitudes alone, e.g., from 
whole-cell recordings, do not permit a distinction to be made be- 
tween mutations that alter the anion permeation rate or those 
that affect the probability of assuming an open anion channel 
state and therefore preclude a direct comparison with simulation 
results. Because of difficulties in cellular expression systems, 
high-resolution electrophysiological recordings of GltPh are not
yet feasible. Assays that were developed to describe GltPh anion
conductance (Ryan and Mindell, 2007) only provide information 

Figure 4. Tryptophan Fluorescence
Quenching by Iodide in GltPh

(A) Overview of GltPh single-tryptophan insertions
(ChCcen in side view). Side chains are color-coded
according to the reduction in fluorescence in-
tensity at 350 mM [I−] (F0 is the intensity in the
absence of I−; n > 5 for each). Red mesh repre-
sents the Cl− density observed in MD (Figure 1D).

(B) Representative fluorescence spectra of
WT, V51W, and S65W GltPh at various [I−]. The
inset shows the comparable concentration
dependence of V51W fluorescence lifetimes and
intensities in a Stern-Volmer plot, indicating a
collisional quenching mechanism.

(C) Stern-Volmer plots for all tryptophan mutants
(means ± SE; n > 5 for each).

(D) Comparison of fluorescence quenching (gray
bars; n > 5 for each) with MD-predicted anion
accessibilities of side chains in various confor-
mations (the different symbols show the average
number of Cl− ions ± SD within 13 Å of the side
chains). Residue numbers on the abscissa are
ordered according to their positions in the mem-
brane plane shown in (A).

See also Figure S5.

about macroscopic anion currents through an ensemble of
multiple GltPh transporters. Because the functional properties
of GltPh (Ryan and Mindell, 2007) and EAAT anion channels are
very similar (Melzer et al., 2003; Wadiche and Kavanaugh,
1998) and the pore-lining residues are highly conserved (Fig-
ure S4), it is reasonable to assume that the proposed GltPh anion
permeation pathway is also responsible for EAAT anion conduc-
tion. We therefore compared the effects of in silico mutagenesis
on simulated GltPh anion conductance and anion/cation selec-
tivity with experimental data on mammalian EAAT2/EAAT4. Sin-
gle-channel recordings have not yet been possible for these
transporters, but unitary current amplitudes can be determined
by noise analysis of whole-cell current recordings, and anion/
cation selectivities can be obtained through reversal potential
measurements at various ionic conditions (Melzer et al., 2003).

We initially screened for mutations that affect pore properties
in silico. Pore-forming residues were identified by a geometrical
criterion (within a distance <6.9 Å to the pore center defined by
R276). We excluded only a few residues that are located within
the transport domain and known to be crucial for substrate binding
(e.g., residues in HP2 to prevent interference with substrate
binding). We generated 29 GltPh pore mutants in silico, including
S65 and I61, which are close to a recently discussed alternative
location of the anion channel (see below), and subjected them to
MD simulations. We identified side-chain substitutions that increase
or decrease unitary conductances (I16E/K, L20E, F50L/K/D, V51D,
L54D, I61D, A205D, R276S/D) or modify the anion/cation selectivity
(F50D, A205D, R276S) (Figure 5). We then performed whole-cell
patch-clamp recordings of 33 EAAT2/EAAT4 mutants (Figure S6).
Because most mutations also affected anion channel gating (Fig-
ure S6), a direct comparison of whole-cell currents and MD data
was not feasible. However, ten EAAT4 mutants exhibited suffi-
cient time- and voltage-dependent gating to allow single-channel
conductances to be determined using nonstationary noise anal-
ysis (Figure 6). Four charge-altering mutations, L20E, I16K, I16E,
and I61D, increased or decreased the simulated Cl− permeation
rate of GltPh to a similar extent as alterations in experimental sin-
gle-channel conductance caused by the homologous EAAT4
mutations. For four GltPh mutants, V12E, I16W, and S65A, exper-
imental and simulated unitary conductances were unaltered (Fig-
ure 6B). Interestingly, I16E, located in the intracellular part of the

Figure 5. MD Screening of Pore-Lining Residues Predicts Mutations that Affect Anion Conductance and Anion/Cation Selectivity

(A and B) Stick representations of pore-lining residues in side view (A) or top view (B) and colored as in Figure 3A, including detailed GltPh residue number labels.
(C) Summary of simulated Na+/Cl− conductances for various GltPh mutants (ChCcen; means ± SD; MD times range from 120 to 500 ns for each mutant) at
~+800 mV.



GltPh anion conduction pathway, selectively reduced outward
fluxes of anions in a valve-like manner, as demonstrated by out-
ward current rectification in the I16E GltPh and homologous T59E
EAAT4 mutants. In contrast, for the neighboring residue V12,
which is closer to the bulk solution and further from the pore cen-
ter than I16 (Figure 5A), conversion to glutamate did not affect
unitary conductances in GltPh or in F55E EAAT4. We furthermore
found four “semiconserved” side chains that are conserved
within mammalian EAATs but differ in GltPh: the aforementioned
R276 residue (the corresponding EAAT arginine is located at
the M395 position in GltPh), F50 (L in EAATs), and M94 (V in
EAATs; Figure S4). We constructed GltPh and EAAT4 mutants
to reverse these evolutionary exchanges and observed recip-
rocal effects on conductance, as would be expected if direct
interactions exist between these side chains and permeating
anions (Figure 6C).
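
Nonstationary noise analysis extracts the unitary current from the relation between the mean current I and its variance σ². For N identical, independent channels with unitary current i, σ² = i·I − I²/N, so that a plot of σ²/I against I is a straight line with intercept i and slope −1/N. The sketch below applies that linear transform after subtracting a background variance. It is a textbook illustration with hypothetical function names and synthetic numbers, and the exact fitting procedure used for the recordings is given in the Extended Experimental Procedures, not here.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def unitary_current_from_noise(mean_currents, variances, background_variance=0.0):
    """Return (unitary current i, channel number N) from a variance/mean fit."""
    xs = list(mean_currents)
    ys = [(v - background_variance) / i for v, i in zip(variances, mean_currents)]
    slope, intercept = linear_fit(xs, ys)   # y = i - I / N
    return intercept, -1.0 / slope

# Synthetic data (assumption): i = 0.02 pA, N = 5000 channels, background 0.05 pA^2.
i_true, n_true, bg = 0.02, 5000.0, 0.05
open_probs = [0.1, 0.3, 0.5, 0.7, 0.9]
means = [i_true * n_true * p for p in open_probs]
variances = [i_true * m - m * m / n_true + bg for m in means]
i_est, n_est = unitary_current_from_noise(means, variances, background_variance=bg)
```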

The simulated GltPh mutants F50D, A205D, and R276S ex-
hibited Na+ permeation along the same path as Cl− (Figures 7A
and 7C). Because some of the corresponding EAAT4 mutations
prevented their functional expression in cells, homologous muta-
tions were introduced into EAAT2, a transporter with unitary cur-
rent properties similar to those of EAAT4 (Schneider et al., 2014).

Figure 6. Mutations of Pore-Forming Residues Modify Experimental EAAT4 Anion Conductances and Anion/Cation Selectivity

(A) Representative nonstationary noise analysis of T59K EAAT4, showing current responses to 300 repeated voltage jumps (top) and the resulting current var-
iances (middle). Bottom, linearly transformed current-variance plot (background noise at 0 mV was subtracted from the total variance). Red line shows a linear fit.

(B) Experimental EAAT4 (gray; from whole-cell recordings and nonstationary noise analysis; means ± SE; n = 6-9) and simulated GltPh unitary conductances of
WT and homologous mutants (in color; means ± SD). Ordinates were scaled to show experimental WT EAAT4 conductances at +150 mV and simulated WT GltPh
conductances at +800 mV at the same level (gray line).

(C) Changes in experimental EAAT4 (n = 6-8) and simulated GltPh unitary conductance upon substitution of residues that are conserved in EAATs, but not in GltPh
(Figure S4).

See also Figures S6 and S7 and Table S1.



Varying the external [Na+] led to changes in ion current reversal
potentials in cells expressing L85D, S288D, and R476M
EAAT2, indicating that these EAAT2 mutants represent unselective
channels with varying degrees of relative cation selectivity (Fig-
ure 7B). In these experiments, coupled glutamate transport,
which would additionally affect the reversal potential, was abol-
ished by using a K+-free intracellular solution.
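
Why a shift in reversal potential with external [Na+] reports on cation permeability can be illustrated with the Goldman-Hodgkin-Katz voltage equation for a pore permeable to Na+ and Cl−. The sketch below is an assumed textbook framework, not necessarily the analysis applied to these recordings. It further assumes that Na+ is replaced by an impermeant cation so that [Cl−] stays constant, and the temperature, concentrations, and function name are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618    # J / (mol K)
FARADAY = 96485.33212         # C / mol

def reversal_potential_mV(p_na, p_cl, na_out, na_in, cl_out, cl_in, temp_k=295.15):
    """GHK reversal potential (mV) for a pore permeable to Na+ and Cl− only."""
    numerator = p_na * na_out + p_cl * cl_in
    denominator = p_na * na_in + p_cl * cl_out
    return 1e3 * (GAS_CONSTANT * temp_k / FARADAY) * math.log(numerator / denominator)

# Perfectly anion-selective pore: external [Na+] (mM) has no influence.
selective_high = reversal_potential_mV(0.0, 1.0, 140.0, 10.0, 150.0, 150.0)
selective_low = reversal_potential_mV(0.0, 1.0, 14.0, 10.0, 150.0, 150.0)

# Unselective pore (equal permeabilities): the reversal potential follows [Na+].
leaky_high = reversal_potential_mV(1.0, 1.0, 140.0, 10.0, 150.0, 150.0)
leaky_low = reversal_potential_mV(1.0, 1.0, 14.0, 10.0, 150.0, 150.0)
```

In this picture, a reversal potential that is insensitive to external [Na+] indicates an anion-selective pore, whereas any Na+ dependence implies a finite cation permeability.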

The effect of these negative charge mutations was site-spe-
cific, as demonstrated by experiments and simulations with
A362D EAAT2 and the corresponding R276D-M395R GltPh.
A362 in EAAT2 is homologous to R276 in GltPh, whose positively
charged side chain is crucial for anion selectivity (Figure 3). To
modify the electrostatic potential in GltPh in a manner similar to
that at A362 in EAAT2, we inserted an arginine at the “EAAT po-
sition” M395, in addition to the R276D mutation (Figures 3A and
S4). R276D-M395R GltPh and A362D EAAT2 exhibited perfect
Cl− selectivity in both simulations and experiments, indicating
that anion selectivity is only impaired by the insertion of nega-
tively charged side chains at specific positions (Figures 5 and 7).

The Novel Anion Channel Conformation Enables a 
Reinterpretation of Previous Functional Data and Is 
Compatible with Published Crosslinking Results 

Prior to our work, the structural basis of EAAT/GltPh anion conduction
was unknown. However, because mutations around
S65 were reported to affect anion permeation of both GltPh
(Ryan and Mindell, 2007) and EAAT1 (Cater et al., 2014; Ryan
et al., 2004), and because crystallographic data (Verdon and
Boudker, 2012) suggested the existence of an aqueous cavity
in ICcrystal, it has been hypothesized that anions permeate along
a pathway that we will refer to as “S65 path” (Figure S7A). As yet,
no other EAAT anion permeation pathway has been proposed.



We used MD simulations and experimental approaches
to test whether anion permeation along the “S65 path” might
contribute to EAAT/GltPh anion conduction. MD simulations
demonstrated water access but no Cl− density along the “S65
path” in ChC (Figure S7A). Pore searching algorithms (see the
Extended Experimental Procedures) did not identify any additional
candidate anion pore in the S65 region. Mutations of S65
did not affect anion conductance in MD simulations (Figure 5).
In our fluorescence assay, S65W GltPh fluorescence was
not quenched by iodide (Figure 4). Whereas the homologous
S108V EAAT4 mutant was mostly retained in intracellular compartments,
S108A EAAT4 was robustly expressed on the surface
of mammalian cells, with resulting current amplitudes comparable
with those of WT EAAT4. S108A EAAT4 exhibited altered
anion channel gating but unaltered unitary current amplitudes
(Figures 6B and S6). These results indicate that mutations of
S65/S108 do not affect the single-channel conductance itself
but instead alter the channel open probability, i.e., the rates of
reactions leading to the open anion channel.

We generated three additional EAAT4 mutants (V101D,
L104D, and N297D) with negatively charged side chains projecting
into the “S65 path.” Mutant channels exhibited altered
voltage- and glutamate-dependent gating but were still glutamate
sensitive and cation impermeable (Figures S6B, S6C,
S7A, and S7B). One mutation in this region, I61D GltPh/L104D
EAAT4, even increased anion permeation rates in both simulations
and experiments (Figures 5 and 6B). Because this residue
does not directly line the Cl− permeation pathway, which remained
unchanged upon I61D substitution, and because the
introduction of a negative charge increases anion conductance,
we deduce that this mutation indirectly affects the anion channel
function. We conclude that the mutated amino acids surrounding
S65 in GltPh do not line the EAAT anion pore, although they do
influence the conformational changes underlying the probability
of the channel being open.

A recent study demonstrated that crosslinking a substituted
cysteine within the transport domain to another in the trimerization
domain abolishes EAAT3 glutamate transport but does not
abrogate substrate-dependent anion conductance (Shabaneh
et al., 2014). The authors concluded that, starting from OFC, a
limited inward movement of the transport domain is sufficient
for formation of an anion conducting conformation. Cysteines
were inserted at positions corresponding to residues 216 and
391 in GltPh. These residues are in close proximity in OFC, ICout,
and ICcrystal. Because MD simulations demonstrated a pronounced
increase in the V216-A391 Cα distance to >7 Å during
transmembrane translocation and channel opening (Figure S7C),
this disulfide link might prevent transitions into the ICcen, ICint,
ChC, and IFC states. To evaluate the effects of this disulfide
bridge on the conformational changes underlying anion channel
opening, we performed simulations on an intermediate conformation
of our translocation trajectory that is located at the
most central position along the translocation axis (to increase
the likelihood of pore opening) but maintains a distance between
these two residues of <7 Å (Figures 1 and S7C). The crosslinkage
was modeled by a distance restraint on the two Cα atoms
within monomers (Figures S7C and S7D). Simulations of the
V216-A391 crosslinked GltPh model showed that this disulfide
link limits the lateral movement of the transport domain but permits
sufficient conformational flexibility for pore opening and
anion permeation along the identified anion conduction pore
(Figures S7E and S7F). The experimental effects of this crosslink
on transport and anion currents in EAAT3 (Shabaneh et al., 2014)
are therefore fully consistent with the GltPh anion permeation
pathway presented here.




Figure 7. Conversion of the EAAT2 Anion Pore into a Cation-Conducting Channel

(A) Cl− (red) and Na+ (blue) distributions (σ = 0.2)
around WT and F50D GltPh. Residues described in
(B) and (C) are shown as sticks.

(B) Variations in current reversal potentials with
external [Na+] demonstrate the cation permeability
of L85D EAAT2 (homologous to F50D GltPh),
S288D, and R476M (but not of WT and A362D)
EAAT2 anion channels (means ± SE; n = 6-13 for
each).

(C) Ratio of simulated cation/anion permeation
events for WT and corresponding GltPh mutants
(means ± SD; RSMR, R276S-M395R), colored
according to (B). The F50D mutant was tested with
the arginine at position 276 (WT) and in the context
of the EAAT arginine position (RSMR mutation).
See also Figures S6 and S7 and Table S1.



DISCUSSION 

EAAT glutamate transporters are proto- 
typical dual function proteins that oper- 
ate as both secondary active transpor- 
ters and anion-selective ion channels. 
Whereas the key structural features of secondary active glutamate
transport have been established (Akyuz et al., 2013; Crisman
et al., 2009; Reyes et al., 2009; Shrivastava et al., 2008),
structural and mechanistic details of anion permeation have 
been hitherto unknown. In this study, we used a combination 
of computational and experimental approaches to determine 
how this class of transporters mediates anion permeation 
through an aqueous conduction pathway. MD simulations identified
an open channel conformation of GltPh that was consistently
formed from various ICs by the lateral movement of
the transport domain (Figure 1). Opening of the interface be- 
tween the transport and trimerization domains is followed by 
voltage-promoted water entry (Figure 2) and the formation of 
an anion-selective conduction pathway (Figure 3). We verified 
the predictions of our simulations by fluorescence spectroscopy 
and functional studies using mutant transporters. Fluorescence 
quenching experiments demonstrated that tryptophan residues 
substituted at positions that project into the predicted conduc- 
tion pathway come into close contact with permeating anions 
(Figure 4). Moreover, substitution of pore-forming residues had 
comparable experimental effects on the two key characteristics 
of an anion-selective conduction pathway, i.e., anion/cation 
selectivity and ion permeation rates, as predicted by simulations 
(Figures 5, 6, and 7). These data indicate that pore-forming res- 
idues identified through simulations are indeed the major deter- 
minants of anion permeation and selectivity in both GltPh and
EAATs. Moreover, they demonstrate that this anion conduction 
pathway is conserved throughout the glutamate transporter fam- 
ily. Our data thus clarify how a class of secondary active trans- 
porters can function as anion-selective channels that are gated 
by transitions in the transport cycle.

The ion conduction pathway reported herein accounts for 
all known functional properties of EAAT/GltPh anion channels.
Simulations reveal unitary current amplitudes and ion selectivities
(Figures 3C and 3D) that resemble experimental results
(Melzer et al., 2003; Wadiche and Kavanaugh, 1998). The calculated
minimum pore diameter of GltPh is ~5.6 Å (Figure 3E),
which perfectly fits the predicted minimum pore diameter of
>5 Å based on anion substitution experiments on EAAT1 anion
channels (Wadiche and Kavanaugh, 1998). Rapid substrate 
application experiments have shown that EAAT anion channel 
activation is delayed compared with glutamate translocation 
(Grewer et al., 2000). These findings indicate that anion-conducting
states existing “outside” the glutamate uptake cycle
can be explained by channel opening as a branching reaction 
from ICs (Figure 1). Simulations predict voltage independence 
of anion channel opening within the physiological voltage range 
(Figure 2B). This result explains the experimental observation 
that the voltage- and substrate dependence of EAAT anion chan- 
nels are tightly linked to transitions within the transport cycle 
(Bergles et al., 2002; Machtens et al., 2011a). Simulated anion 
permeation is unchanged in both the presence and absence of 
bound substrate (Figure 3E), as expected from the experimental 
unitary conductances being indistinguishable in the presence 
and absence of glutamate (Kovermann et al., 2010). Because 
anion channel opening is tightly linked to translocation of the 
transport domain, our results indicate that transport substrates 
increase EAAT anion currents by promoting intermediate states. 
Distinct EAAT isoforms differ strongly in the relative amplitudes 
of their transport and anion currents (Fairman et al., 1995; Mim 
et al., 2005). However, analysis of unitary current amplitudes revealed
similar single-channel amplitudes (Schneider et al., 2014;
Torres-Salazar and Fahlke, 2007). The high degree of conservation
of pore-forming residues (Figure S4) is consistent with the
similarities in anion channel unitary current amplitudes and 
selectivity of different transporter isoforms. Lastly, the novel 
anion conducting conformation can account for all published 
mutagenesis and crosslinking results on EAAT anion conduction 
(Ryan et al., 2004; Shabaneh et al., 2014). 

The “S65 path” (Figure S7) is the only location of the anion 
channel that has been discussed in recent years. We could not 
find any indication for a direct contribution of this region to anion 
permeation. Our simulations show that the “S65 path” is hydrat- 
ed in ChC, thereby suggesting that S65 and adjacent residues 
could be involved in facilitating the opening of the transport/tri- 
merization domain interface instead. We thus speculate that 
the “S65 path” may modulate formation of the ChC conforma- 
tion, which provides an explanation for the impact of mutations 
in this region on anion channel function (Cater et al., 2014; 
Ryan and Mindell, 2007; Ryan et al., 2004). 

The positive electrostatic potential necessary for perfect anion 
selectivity of EAAT/GltPh anion channels is provided by a single
positively charged side chain, R276. Surprisingly, during evolution,
this arginine has moved from the tip of HP1 in GltPh to
TM8 in EAATs, while retaining a similar side chain position in
the tertiary structure. In GltPh, as well as in EAATs, this arginine
has been implicated in binding amino acid substrates, as well
as binding Na+ and K+ (Ryan et al., 2010; Verdon et al., 2014).
Unitary anion conductance is not affected by aspartate (Figure 3), 
indicating that the interaction of R276 with transport substrates 
does not modify its effect on anion conduction and selectivity. 



The tight linkage between anion channel gating and glutamate 
transport in EAAT/GltPh was previously explained by assuming
that certain states of the transport cycle are anion conducting
(Bergles et al., 2002). Because GltPh structures did not exhibit
an open pore with dimensions that might account for the experimentally
observed anion conduction properties, it was recently
suggested that additional yet to be defined ICs that occur during
translocation might be anion conducting (Cater et al., 2014). We
have now demonstrated that intermediate transport conformations
are nonconducting and that EAAT/GltPh anion channel
opening transitions require the lateral movement of the gluta- 
mate transport domain together with pore hydration from inter- 
mediates. Anion channel opening is therefore not part of the 
transport cycle, but instead is achieved via a branching confor- 
mational change. This design permits rapid transition through 
the full transport cycle without anion channel opening. Further- 
more, it allows certain EAAT isoforms to function as effective 
transporters, with low anion channel open probabilities, and 
other isoforms to have low transport rates but high occupations 
of the anion channel mode. 

The unique mechanism of EAAT anion channel gating results 
in neuronal or glial anion conductances that follow changes 
in substrate concentrations and thus allow feedback control 
of glutamate release (Wersinger et al., 2006) or modification 
of GABAergic postsynaptic currents by glutamatergic signals 
(Winter et al., 2012). Moreover, it explains why isoform-specific 
variations in glutamate transport by EAATs result in the forma- 
tion of anion channels that preferentially open or close within 
their physiological voltage range (Schneider et al., 2014). 
Recently, gain-of-function mutations in genes encoding EAAT 
anion channels have been linked to pathological neuronal 
excitability and cell-volume regulation (Winter et al., 2012). 
EAAT anion channel activity is also enhanced under conditions 
of increased synaptic glutamate concentration and may thus 
contribute to the clinical symptoms associated with brain 
ischemia or certain neurodegenerative diseases. The structural 
and mechanistic data presented here might help in the design 
of EAAT anion channel modulators and thus open therapeutic 
avenues to correct the cellular defects linked to these patho- 
logical conditions. 

EXPERIMENTAL PROCEDURES 
Molecular Simulations 

MD simulations of GltPh, bound by a negatively charged aspartate and two
Na+ ions, in outward-facing (OFC; Protein Data Bank [PDB] ID code 2NWX),
inward-facing (IFC; PDB ID code 3KBC), and various intermediate conformations
(including ICcrystal; PDB ID code 3V8G) were performed using GROMACS
4.5 (Hess et al., 2008). Based on our OFC and IFC simulation trajectories, we
obtained intermediates ICint, ICcen, and ICout from the crystallographic structures
of OFC and IFC using essential dynamics sampling simulations (Amadei
et al., 1996). Proteins were inserted and equilibrated in a double dimyristoyl
phosphatidylcholine bilayer surrounded by a 1 M NaCl aqueous solution and
were subjected to various membrane potentials using the computational
electrophysiology scheme described recently (Kutzner et al., 2011).

Molecular Biology 

Mutant constructs of GltPh, human EAAT4, and rat EAAT2 were generated using
the QuikChange Site-Directed Mutagenesis Kit (Agilent Technologies) and
verified by restriction analysis and DNA sequencing.

Fluorescence Spectroscopy

Fluorescence emission spectra of single-tryptophan GltPh mutants in
n-dodecyl-β-D-maltoside micelles in the presence of saturating [Na+] and [Asp−]
at various [I−] were recorded after excitation at 295 nm. Fluorescence lifetimes
were determined through time-correlated single-photon counting. 

Electrophysiology 

Heterologous expression and whole-cell patch-clamp recordings of EAAT2 
and EAAT4 were performed as described previously (Machtens et al., 
2011a). Unitary conductances were determined by nonstationary noise anal- 
ysis of current responses to 300 repetitive voltage jumps to ±150 mV using 
140 mM NO3− as main anion and 1 mM aspartate as substrate to enhance
voltage-dependent gating of the channel. 
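
As an illustration of the principle behind nonstationary noise analysis, the sketch below simulates repeated sweeps from a population of identical, independent channels and recovers the unitary current from the linearized variance-current relation, variance/I = i − I/N. It is not the authors' analysis code: the channel number, unitary current, and open-probability time course are made-up values, and the unitary conductance would follow by dividing i by the driving force.

```python
import numpy as np

def unitary_current_from_noise(mean_current, variance, background_variance=0.0):
    """Fit variance/I = i - I/N and return (unitary current i, channel count N)."""
    mean_current = np.asarray(mean_current, dtype=float)
    corrected = np.asarray(variance, dtype=float) - background_variance
    nonzero = mean_current != 0.0
    ratio = corrected[nonzero] / mean_current[nonzero]
    slope, intercept = np.polyfit(mean_current[nonzero], ratio, 1)
    return intercept, -1.0 / slope

# Synthetic data: 300 sweeps, 50 time points, hypothetical channel parameters.
rng = np.random.default_rng(0)
true_i_pa, true_n = 0.05, 2000
open_probability = np.linspace(0.1, 0.8, 50)
open_channels = rng.binomial(true_n, open_probability, size=(300, 50))
sweeps = true_i_pa * open_channels

i_est, n_est = unitary_current_from_noise(
    sweeps.mean(axis=0), sweeps.var(axis=0, ddof=1)
)
print(f"estimated unitary current: {i_est:.3f} pA, channels: {n_est:.0f}")
```

The linear transformation is convenient because the intercept gives the unitary current directly, without needing to resolve the parabola's maximum.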

Statistics 

Asterisks indicate the level of statistical significance derived from a two-tailed 
t test (*** p < 0.001; ** p < 0.01; * p < 0.05; ns, p > 0.05; see Table S1).
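
The asterisk convention can be written as a small lookup. This is only a sketch of the stated thresholds; the function name is hypothetical, and treating p = 0.05 exactly as "ns" is an assumption, since the text does not cover that boundary.

```python
def significance_label(p_value):
    """Map a two-tailed p value to the asterisk notation used in the figures."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumption: p == 0.05 is reported as not significant

labels = [significance_label(p) for p in (0.0004, 0.003, 0.02, 0.3)]
print(labels)
```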

SUPPLEMENTAL INFORMATION 

Supplemental Information includes Extended Experimental Procedures, seven 
figures, one table, and one movie and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2014.12.035.

SUMMARY

The mammalian radiation has corresponded with 
rapid changes in noncoding regions of the genome, 
but we lack a comprehensive understanding of regu- 
latory evolution in mammals. Here, we track the 
evolution of promoters and enhancers active in liver 
across 20 mammalian species from six diverse 
orders by profiling genomic enrichment of H3K27 
acetylation and H3K4 trimethylation. We report 
that rapid evolution of enhancers is a universal 
feature of mammalian genomes. Most of the recently 
evolved enhancers arise from ancestral DNA exaptation,
rather than lineage-specific expansions of
repeat elements. In contrast, almost all liver pro- 
moters are partially or fully conserved across these 
species. Our data further reveal that recently evolved 
enhancers can be associated with genes under 
positive selection, demonstrating the power of this 
approach for annotating regulatory adaptations in 
genomic sequences. These results provide impor- 
tant insight into the functional genetics underpinning 
mammalian regulatory evolution. 

INTRODUCTION 

Most mammalian genes are controlled by collections of 
enhancer regions, often located tens to hundreds of kilobases 
away from transcription start sites. Recent studies comparing 
key selected mammals (Cotney et al., 2013; Xiao et al., 2012) 
have indicated that enhancers may change rapidly during evolution
(Degner et al., 2012; Shibata et al., 2012), particularly when
compared with evolutionarily stable gene expression patterns 
(Brawand et al., 2011; Chan et al., 2009; Merkin et al., 2012). 
Given that most phenotypic differences are hypothesized to 
largely result from regulatory differences between mammals, it 
is of profound importance to understand the mechanisms driving 
enhancer evolution (Villar et al., 2014; Wray, 2007). 

Both conserved and recently evolved enhancer sequences 
have been shown to have important phenotypic consequences. 
Highly conserved enhancer sequences can regulate funda- 
mental processes, such as embryonic development, and this 
property has been used to screen for functional regulatory ele- 
ments (Pennacchio et al., 2006). However, sequence-level 
changes in enhancer elements can also underlie evolutionary dif- 
ferences between species (Hare et al., 2008; Ludwig et al., 2005), 
as has now been demonstrated across many organisms (Arnold 
et al., 2014; Cotney et al., 2013; Degner et al., 2012; McLean 
et al., 2011; Shibata et al., 2012). 

Approaches comparing vertebrate genome sequences, such 
as those employing 29 mammals, have revealed regulatory re- 
gions under sequence constraint (Lindblad-Toh et al., 2011). 
However, this approach is limited in resolving tissue-specific 
deployment or regulatory activity directed by small sequence 
changes, particularly as may be predicted for rapidly evolving 
enhancer regions (however, see Pollard et al., 2006; Prabhakar 
et al., 2006). Comparative analysis of mammalian genomes 
can indicate protein sequence adaptations in particular species 
or lineages, and infer which coding regions are under positive 
selection. In contrast, complementary experimental efforts are 
currently lacking to functionally annotate the many recently 
sequenced mammalian genomes. 

Experimental tools can now empirically identify regulatorily 
active DNA across entire mammalian genomes. Enhancers can
be identified by mapping regions enriched for acetylated lysine
27 on histone H3 (H3K27ac) via chromatin immunoprecipitation 
followed by high-throughput sequencing (ChIP-seq) (Creyghton 
et al., 2010). Similarly, active gene promoters can be identified 
as containing both H3K27ac and trimethylated lysine 4 of 
histone H3 (H3K4me3), which marks sites of transcription 
initiation (Cain et al., 2011; Santos-Rosa et al., 2002). The usefulness
of this approach to map regulatory activity genome-wide
has been recently underscored by analysis of H3K27ac dy- 
namics across organ development in mouse (Nord et al., 
2013). This study found that most H3K27ac developmental vari- 
ation occurs distally to transcription start sites and within pre- 
dicted enhancer elements, most of which could be validated 
experimentally. 

Over 20 sequenced mammalian genomes have been integrated
into inter-species alignments within Ensembl (Flicek
et al., 2014). Exploiting this computational infrastructure (and
related resources in Drosophila; Kim et al., 2009), recent studies
have dissected how transcription factor (TF) binding has evolved
(He et al., 2011; Paris et al., 2013; Schmidt et al., 2010; Stefflova
et al., 2013). In addition, enhancer and promoter evolution have
been investigated using sets of mammals, where H3K27ac levels
have been characterized across tissues and developmental
states as a proxy for enhancer function and developmental
or tissue-specific gene expression (Cotney et al., 2013; Nord
et al., 2013; Xiao et al., 2012).

Here, we report the results of empirically mapping promoter
and enhancer evolution across 20 mammals chosen to span 
the breadth and depth of the class Mammalia, including previ- 
ously uncharacterized species such as cetaceans and naked 
mole rat. Our analyses have revealed the tempo and mecha- 
nisms underlying enhancer evolution across over 180 million 
years of mammalian radiation. 

RESULTS 

Profiling Promoter and Enhancer Regulatory Evolution 
in Mammalian Liver 

We mapped the active promoter and enhancer elements in liver 
as a representative adult somatic tissue from 20 species of mam- 
mals (Figure 1). Study species were selected using three criteria: 
(1) to capture a substantial fraction of the mammalian phyloge- 
netic tree, (2) to profile the major placental orders in a combi- 
nation of intra- (6-40 Ma) and inter-lineage (100-180 Ma) 
evolutionary distances, and (3) to extend our understanding of 
regulatory evolution to previously uncharacterized mammals 
whose phenotypes are highly divergent, such as cetaceans, 
naked mole rat, and Tasmanian devil. Liver from almost all study 
species was profiled in biological replicates from two or more in- 
dividuals, except for Sei Whale (Balaenoptera borealis), where 
only one individual’s tissue was available; and for dolphin, for 
which we combined data from two closely related dolphin species
(Delphinus delphis and Lagenorhynchus albirostris) where
a single individual from each species was profiled (Tables S1
and S2, Experimental Procedures).

Using ChIP-seq, we quantified the genome-wide occurrence
of two key histone marks widely used to profile promoters
and enhancers: H3K4me3 and H3K27ac (Figure 1) (Creyghton
et al., 2010; Santos-Rosa et al., 2002). We identified regions
enriched for these histone marks within each mammalian liver
genome using only biologically reproducible peaks present in
two or more replicates (Figure S1, Experimental Procedures).
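
The replicate filter can be sketched as a simple interval-overlap test. This is an illustration only and not the pipeline described in the Experimental Procedures: the peak coordinates are invented, the helper names are hypothetical, and peaks are represented as half-open (chromosome, start, end) tuples with the first replicate taken as the reference.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share any base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def reproducible_peaks(replicates, min_replicates=2):
    """Keep reference-replicate peaks supported by at least min_replicates replicates."""
    reference, others = replicates[0], replicates[1:]
    kept = []
    for peak in reference:
        # The reference replicate itself counts as one supporting replicate.
        support = 1 + sum(
            any(overlaps(peak, other) for other in replicate)
            for replicate in others
        )
        if support >= min_replicates:
            kept.append(peak)
    return kept

# Hypothetical peak calls from two individuals of one species.
rep1 = [("chr1", 100, 500), ("chr1", 1000, 1400), ("chr2", 10, 90)]
rep2 = [("chr1", 450, 800), ("chr2", 200, 300)]
kept = reproducible_peaks([rep1, rep2])
print(kept)
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the quadratic scan used here for clarity.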

A total of 30,000-45,000 regions per species were enriched in liver,
and these separated into H3K27ac, H3K4me3&H3K27ac, and
H3K4me3-marked elements (Figures 1C and S1). Our analyses
were robust to variability in the genome assembly quality and
sample preparation (Experimental Procedures and Figure S2).
We confirmed that H3K4me3 often co-occupied the genome
with H3K27ac (Heintzman et al., 2009; Zhu et al., 2013), and
that most H3K4me3-positive regions occur at transcriptional
start sites (Cain et al., 2011; Santos-Rosa et al., 2002), regardless
of their H3K27ac enrichment (see Experimental Procedures).
In contrast, regions enriched for H3K27ac often were
not enriched for H3K4me3, and these often located far from
transcriptional start sites (Figure S2).
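
The three-way separation of marked regions can be illustrated by overlap-based labeling. The sketch below assumes a simple rule (any shared base counts as co-occupancy) that is not stated in the text, and the coordinates and function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share any base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def classify_regions(h3k27ac_peaks, h3k4me3_peaks):
    """Group peaks into H3K27ac-only, doubly marked, and H3K4me3-only classes."""
    classes = {"H3K27ac only": [], "H3K4me3&H3K27ac": [], "H3K4me3 only": []}
    for peak in h3k4me3_peaks:
        has_acetylation = any(overlaps(peak, other) for other in h3k27ac_peaks)
        label = "H3K4me3&H3K27ac" if has_acetylation else "H3K4me3 only"
        classes[label].append(peak)
    for peak in h3k27ac_peaks:
        if not any(overlaps(peak, other) for other in h3k4me3_peaks):
            classes["H3K27ac only"].append(peak)
    return classes

# Hypothetical reproducible peaks for one species.
example = classify_regions(
    h3k27ac_peaks=[("chr1", 100, 2100), ("chr1", 50000, 52000)],
    h3k4me3_peaks=[("chr1", 900, 1500), ("chr2", 300, 900)],
)
for label, peaks in example.items():
    print(label, peaks)
```

Under the interpretation used in this study, the H3K27ac-only class corresponds to candidate enhancers and the H3K4me3-containing classes to candidate promoters.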

The regions we identify as enhancers strongly enrich for regu- 
latory activity in liver, consistent with numerous prior studies 
(Cotney et al., 2013; Creyghton et al., 2010; Nord et al., 2013; 
Zhu et al., 2013). For over 400 of our human liver enhancers (typically
2 kb in length), the transgenic activities of overlapping
145 bp segments were assayed in liver cancer cells (Kheradpour
et al., 2013) (Figure S2). Although each human liver enhancer 
was on average represented by only a single small sequence 
element, capturing less than 10% of the enhancer length, over 
65% showed activity in transgenic assays in a cancer cell line. 
Furthermore, over 90% of the enhancers not active in transgenic 
assays were nevertheless bound in human liver by at least one 
liver-specific TF (Ballester et al., 2014). In sum, this analysis sug- 
gests a sizable majority of our empirically determined enhancers 
are regulatorily active. 

Our data newly demonstrate that the known interplay of
H3K4me3 and H3K27ac creates a genomic regulatory landscape
that is a uniform feature across mammals (and likely
across eumetazoans; Schwaiger et al., 2014). In adult liver,
a typical mammalian genome contains on average 12,500
H3K4me3 locations (representing active promoter elements)
and 22,500 H3K27ac-enriched regions (representing active
enhancers).

Enhancer Evolution Is Appreciably More Rapid Than 
Proximal Promoter Evolution 

We used our genome-wide mapping data in livers from 20 
mammals to obtain an empirical and quantitative understanding 
of evolutionary stability of promoters and enhancers (Figure 2 
and Figure S3). 

Figure 1. In Vivo Regulatory Activity Assessed in Livers from 20 Mammals

(A and B) Phylogenetic relationships and species divergences are represented by an evolutionary tree, which includes 18 placental species (in four orders) and 2
marsupial species (in two orders). In liver isolated from each species, enhancer activity was globally mapped by identifying genomic regions enriched for
acetylation of H3K27 (H3K27ac), and transcription initiation was mapped by identifying genomic regions enriched for tri-methylation of H3K4 (H3K4me3). Shown

(legend continued below)

Most non-coding regions in the human genome cannot be
mapped across 20 mammals, in large part because the genome
structure and regulatory content of complex eukaryotes evolve
rapidly (Lynch et al., 2011). We defined the maximum detectable
conservation of activity as the number of species in which the
DNA could be aligned (Figure 2A). For example, if enhancer activity
is highly conserved, then this activity would be detected
in all species where the underlying DNA was alignable. In
contrast, low conservation would be characterized by the underlying
DNA remaining alignable across many species, but without
sharing of enhancer activity. Such low conservation could be a
signature of rapid functional evolution or, alternatively, functional
neutrality.
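
The comparison between alignability and shared activity can be sketched per region as follows. This is a minimal illustration with invented species sets and a hypothetical function name; it simply reports, for one regulatory region, the maximum detectable conservation (number of species with alignable DNA) next to the number of those species in which activity was observed.

```python
def conservation_summary(alignable_species, active_species):
    """Return (max detectable, observed, fraction) for one regulatory region."""
    max_detectable = len(alignable_species)
    # Activity only counts where the orthologous DNA could be aligned.
    observed = len(set(active_species) & set(alignable_species))
    fraction = observed / max_detectable if max_detectable else 0.0
    return max_detectable, observed, fraction

# Hypothetical regions: both alignable in the same five species.
alignable = {"human", "macaque", "mouse", "dog", "cow"}
promoter_like = conservation_summary(alignable, alignable)
enhancer_like = conservation_summary(alignable, {"human", "macaque"})
print(promoter_like, enhancer_like)
```

In this toy example the first region reaches its maximum detectable conservation, whereas the second remains alignable in all five species but is active in only two.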

Collectively, the DNA sequences used as promoters and the
DNA sequences used as enhancers in liver show only slight
differences in their alignability across the study species (Figure 2B).
This alignability shows a marked increase at approximately
11-13 species, reflecting the contribution to the multiple
alignments of the ten highest-quality genomes (Experimental
Procedures).

The conservation of active liver promoters tracked remarkably
closely with the alignability of the underlying DNA, indicating
evolutionarily stable promoter activity (Figure 2C, upper left triangle).
In other words, the transcription initiation sites driving gene
expression in liver are highly conserved.

We performed a similar analysis for enhancers. Our data
reveal that rapid enhancer evolution, often involving exaptation
of ancestral DNA, is active and widespread across all the
mammalian clades in our study (Figure 2D, orange, and Figure S3),
as has been reported in primates (Cotney et al.,
2013). Furthermore, the ten highest-quality placental genome
sequences contained thousands of cross-alignable regions
where enhancer activity was shared in many, but not all, species.
These regions are liver enhancers that were likely present
in the common placental ancestor and have partially degraded
along some lineages. In contrast to promoter sites, enhancer
locations evolve rapidly, and comparatively few are deeply
conserved (see below). Control analyses show that while pro- 
moter conservation may be under-estimated, this is not the 
case for enhancers (Figure S3). 

We asked whether the conservation of liver promoters and en- 
hancers is associated with underlying sequence features (e.g., 
TF binding sequences, %GC content, sequence constraint), 
experimental features (reproducibility, occupancy level/inten- 
sity, length), or some combination (Figure 3). The best predictor 
of conservation in promoter regions is the reproducibility and 
strength of enrichment of H3K4me3 and H3K27ac, with the
length of the histone-modified domain and GC content as 
separate, modest contributors. Thus, experimental features are 
stronger indicators of the conservation of regulatory activity, 
and underlying sequence features contribute less to promoter 
stability. In contrast, the presence of TF binding sites can explain 
a modest fraction of the conservation of enhancer activity. 
Nevertheless, as with promoters, the enrichment reproducibility 
and intensity of signal is the primary predictor of conservation. 
Collectively, no combination of sequence- and experimental- 
based features could potentially explain more than a third of 
the variance in conservation of regulatory activity. 

Overall, our data reveal that promoter activity in a representa- 
tive somatic tissue is highly constrained across mammalian 
space. In contrast, enhancer evolution is rapid and widespread. 
Neither enhancer nor promoter activity conservation can be 
explained purely by underlying sequence elements. 



Quantifying the Divergence Rates of Enhancers, 
Promoters, and TF Binding in a Cross-Section of 
Mammals 

The divergence rate of sequence-specific transcription factor
binding (Stefflova et al., 2013) and the extent of regulatory evolution
(Cotney et al., 2013; Shibata et al., 2012; Xiao et al., 2012) have
been estimated using matched experiments from the same tissues
in subsets of typically three to five mammals within a single
order. We took a similar approach to calculate how rapidly en- 
hancers and promoters active in liver evolve across 20 mammals. 

We first identified, by pairwise analysis of all 20 species, 
whether regions called as enhancers and promoters were pre- 
sent in the same location between two mammalian genomes 
(Experimental Procedures, Figure S4). Because this analysis 
does not use human as the primary reference genome, we could 
generate multiple independent estimates of how evolutionarily
stable enhancers and promoters were for comparable diver- 
gence distances. Further, divergence rates could be estimated 
for evolutionary distances not available from a human-centric 
analysis. For instance, our data provided multiple comparisons 
of species separated by 40 to 100 Ma using mouse, cow, or 
dog as reference that could not be obtained using a human- 
centric approach (Figure 1). 

Inter-species conservation of promoters and enhancers could 
be plausibly described as a function of time-of-divergence by 
fitting an exponential decay curve (Experimental Procedures). 
In liver, promoters diverged at a slower rate than did either en- 
hancers or TF bound regions (Figure 4 and Figure S4). Interest- 
ingly, promoters’ half-lives are comparable to protein-coding 
genes’ half-lives, at over a billion years (Rands et al., 2014). 
The higher stability of promoters versus enhancers could be 
due in part to the intimate functional connection promoters 
have with the first exon of protein coding genes, which are highly 
stable features of vertebrate genomes (Lindblad-Toh et al., 
2011). Our results are consistent with a model where the 
increased size and sequence heterogeneity of regions with pro- 
moter or enhancer activity could buffer evolutionary changes 
more robustly than can site-specific TF binding alone (Cotney 
et al., 2013; Shibata et al., 2012; Xiao et al., 2012). 
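
The exponential decay description above can be illustrated with a minimal sketch. It assumes the simplest possible model, in which the conserved fraction equals exp(-rate x t) and passes through 1 at zero divergence; the fitting procedure actually used in the study may differ. The function names and the input values are hypothetical and are not data from the study.

```python
import math

def fit_exponential_decay(distances_ma, fractions):
    """Least-squares estimate of the decay rate in fraction = exp(-rate * t).

    Assumption: the curve is forced through 1.0 at t = 0, so the fit is a
    log-linear regression through the origin: ln(fraction) = -rate * t.
    """
    numerator = sum(t * math.log(f) for t, f in zip(distances_ma, fractions))
    denominator = sum(t * t for t in distances_ma)
    return -numerator / denominator

def half_life(rate):
    """Time after which half of the regions are no longer shared."""
    return math.log(2) / rate

def mean_lifetime(rate):
    """Mean lifetime of an exponentially decaying population."""
    return 1.0 / rate

# Hypothetical, illustrative inputs only (not values from the study).
distances = [20.0, 40.0, 80.0, 160.0]            # divergence times in Ma
true_rate = 0.005                                # per million years
fractions = [math.exp(-true_rate * t) for t in distances]

rate = fit_exponential_decay(distances, fractions)
print(rate, half_life(rate), mean_lifetime(rate))
```

Under this model the mean lifetime is the half-life divided by ln 2, so the two summary statistics always rank regulatory classes in the same order.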

Highly Conserved Regulatory Regions Are Largely 
Proximal Promoters 

Our mapping of liver enhancer and promoter evolution using 
mammals spanning both intra-order (6-40 Ma) and inter-or- 
der (80-180 Ma) divergence times permits the dissection of 
conserved (and recently evolved, see below) regulatory regions. 

We first quantified how many regions showed strong con- 
servation of activity by defining regions as highly conserved if 
regulatory activity was present in (at a minimum) all ten of the 
highest-quality placental genomes (Figure 5A). A total of 2,151 
genomic regions appeared highly conserved by these criteria, 
representing 5% of all human regions active in liver. The 



are examples of regulatory regions active: (A) across all 20 species (MOSPD2 and CCDC93 loci), and (B) active only in primates (GRLH3 and PCKSK8, top) or 
active only in carnivores (UGT1A6 and ABCB11, bottom). For order-specific regulatory regions, data from some species are not shown for conciseness. 

(C) In liver, a typical mammalian genome contains ~22,500 enhancers enriched for only H3K27ac; ~12,500 promoters enriched for both H3K27ac and H3K4me3; 
and ~1,000 containing only H3K4me3. Highest quality genomes incorporated into the EPO multiple alignment are labeled in blue (Experimental Procedures). 
See also Figures S1 and S2 and Tables S1 and S2. 

Figure 2. Enhancers Evolve Rapidly; Promoters Are Highly Conserved 

(A) For a representative 10 Mb region on human chromosome 1, the bar chart on the y axis represents the number of species in which enhancer and promoter 
elements were active (promoters: top, purple; enhancers: bottom, orange). Squares indicate the number of species where the sequence underlying the active 
promoter or enhancer was alignable. 

(B) The DNA sequences underlying proximal promoters and the DNA sequences underlying enhancers can be aligned to similar numbers of species, suggesting 
that differences in apparent conservation of activity are not due to differences in alignability. 

(C) Schematic diagram showing how the conservation of regulatory activity versus DNA alignability across 20 species of mammals can reveal (top) where DNA 
function and DNA sequence orthology closely correspond, indicating ancestral activity, and (bottom) where pre-existing DNA sequences have been exapted 
within specific lineages or species, indicating recently evolved activity. 

(D) Our data revealed that if the DNA underlying a human-identified proximal promoter region (purple) can be aligned with an orthologous sequence in another 
species, then promoter activity is very often present as well (heatmap enrichment concentrated on the diagonal of the plot). In contrast, most enhancer regions 
(orange) are rapidly evolving within older DNA sequences, reflected in increased heatmap enrichment toward the lower x axis. Color scales and dashed contour 
lines indicate absolute numbers of active promoter or enhancer regions (logarithmic scale). 

See also Figure S3. 



existence of over 2,000 highly conserved regions is greater than 
expected by chance (p value < 1 x 10“"^, random permutation 
test, Experimental Procedures). 



Highly conserved regions were classified as promoters or 
enhancers based on their consensus histone mark enrichment 
across all 20 mammals (Experimental Procedures). Of these 

Figure 3. Features Contributing to Conservation of Promoter and Enhancer Activity Identified in Human Liver 

(A) For all human proximal promoters active in liver, the depth of conservation was correlated with experimental features (reproducibility, peak intensity, peak 
length, distance to nearest transcription start site) as well as underlying genomic features (GC content, sequence constraint, TF binding sites). Each feature in 
isolation explained a significant fraction of the variance in conservation of promoter activity (e.g., peak length explained 10%). The fraction explained by the 
features in combination, when added left to right using multiple regression analysis, are plotted as a line above, in sum totaling 36%. The increases in explained 
variance with the addition of each feature are attenuated due to strong inter-correlation of features, quantified in the bottom panel as values between features 
(Experimental Procedures). 

(B) The same analysis was performed for human liver enhancers, where experimental and genomic features together explained a more modest fraction (23%) of 
the conservation of enhancer activity in other species. 



2,151 highly conserved regulatory regions, 1,871 elements 
(87%) were enriched for both H3K27ac and H3K4me3, consistent
with acting as promoters (Santos-Rosa et al., 2002). The 
vast majority of highly conserved promoters occupied the tran- 
scription start sites of genes (Figure 5B). On the other hand, a 
subset of 279 regions showed enrichment only for H3K27ac 
occupancy, consistent with acting as enhancers (Creyghton 
et al., 2010). Most highly conserved enhancers were tens to 
hundreds of kilobases away from the nearest gene (Figure 5B). 
The single region uniformly enriched across placentals for only 
H3K4me3 is not shown. 

In human liver, there are 11,838 promoter regions enriched for 
both H3K27ac and H3K4me3, and 28,963 enhancer regions 
containing only H3K27ac. Although enhancers are nearly three times
as common as promoters, the activity of only 1% of them 
is highly conserved. In contrast, the activity of 16% of promoters 
is highly conserved (Figure 5A). 
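
As a quick arithmetic check, the percentages above follow from the region counts reported in the text. The sketch below only recomputes those ratios; the variable names are ours.

```python
# Counts reported in the text for human liver.
promoters_total = 11838        # H3K27ac + H3K4me3
enhancers_total = 28963        # H3K27ac only
conserved_total = 2151         # highly conserved regions of any type
conserved_promoters = 1871
conserved_enhancers = 279

pct_promoters_conserved = 100 * conserved_promoters / promoters_total
pct_enhancers_conserved = 100 * conserved_enhancers / enhancers_total
enhancer_to_promoter_ratio = enhancers_total / promoters_total
pct_conserved_that_are_promoters = 100 * conserved_promoters / conserved_total

print(f"promoters highly conserved: {pct_promoters_conserved:.1f}%")
print(f"enhancers highly conserved: {pct_enhancers_conserved:.2f}%")
print(f"enhancers per promoter: {enhancer_to_promoter_ratio:.2f}")
print(f"conserved regions that are promoters: {pct_conserved_that_are_promoters:.0f}%")
```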

Three independent lines of evidence support the functionality 
of the sequences we identify as highly conserved regulatory 
regions in liver. First, all show enhanced sequence constraint 
(Figure 5C). Second, genes near highly conserved enhancers 
are strongly enriched for liver-specific functions, and genes 
near conserved proximal promoters are enriched for housekeeping
functions (Figure S5, Tables S3 and S6) (Forrest et al., 
2014). Third, highly conserved enhancers are enriched for TF 
binding motifs for liver-specific regulators such as CEBPA and 
PBX1, whereas highly conserved proximal promoters appear 
dominated by transcriptional initiation regulatory sequences 
(Figure S5, Table S7). 

In sum, in adult mammals comparatively few enhancers are 
evolutionarily stable. In contrast, a substantial fraction of the 
proximal promoters found in human liver appear to be highly 
conserved across mammals. 

Recently Evolved Regulatory Activity Is Pervasive in 
Mammals 

Even for proximal promoters, the number of highly conserved 
regulatory elements active in liver is a small fraction of the total 
number experimentally identified in any single species (Figure 5 
and Table S4). We sought to identify and analyze the molecular 
features of more recently evolved regulatory regions. 

From each placental order, we selected a representative spe- 
cies (human, mouse, cow, dog) and then identified a set of newly 
evolved or, more formally, apomorphic active promoters and en- 
hancers in liver (Figure 6 and Figure S7). For each of these four 
species, we started with all active regions and then removed 

Figure 4. Empirically Determined Rates of Promoter, Enhancer, 
and TF Binding Divergence in Liver across 180 Million Years of 
Mammalian Evolution 

(A) For promoters (purple), enhancers (orange), and TF binding sites (CEBPA, 
black), the fraction of ChIP-seq peaks present at the orthologous location 
between pairs of mammals is shown as a function of evolutionary distance. 
Solid lines represent an exponential decay fit, surrounded by gray shading of a 
95% confidence interval (Experimental Procedures). For liver promoters and 
enhancers, we used data from the ten highest-quality placental genomes, 
while CEBPA data have been previously reported (Schmidt et al., 2010). 

(B) Comparative half-lives and mean-lifetimes (in million years) for active 
promoters, enhancers and CEBPA transcription factor binding locations, as 
calculated from the exponential decay fits in (A). 

(C) Neighbor-joining phylogenetic trees based on pairwise conservation levels 
of enhancer and promoter activity, as measured in (A). Enhancer evolution 
(orange) recapitulates the known relationships among the studied mammals 
(black). The low divergence of promoter activity is insufficient to resolve the 
phylogenetic groups (purple). 

See also Figure S4. 

those that showed any activity within alignable regions in any 
other study species (see Experimental Procedures). We found 
that a typical mammalian liver deploys between 1,000 and 2,000 
promoters and 10,000 enhancers not found in any other study 
species; we henceforth refer to these enhancers and promoters 
as recently evolved. 
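
The filtering step described above can be sketched as a set difference. This is a simplified illustration, not the study's pipeline: it assumes each region already carries a hypothetical identifier that is shared between species whenever the underlying sequences are alignable, so a region in unalignable DNA has an identifier unique to its species.

```python
from typing import Dict, Set

def recently_evolved(active_regions: Dict[str, Set[str]], focal_species: str) -> Set[str]:
    """Return regions active in the focal species and in no other study species."""
    active_elsewhere: Set[str] = set()
    for species, regions in active_regions.items():
        if species != focal_species:
            active_elsewhere |= regions
    return active_regions[focal_species] - active_elsewhere

# Toy input: region identifiers are hypothetical orthology-group labels.
toy_activity = {
    "human": {"r1", "r2", "r3"},
    "mouse": {"r1", "r4"},
    "cow": {"r1", "r2"},
    "dog": {"r1"},
}

print(recently_evolved(toy_activity, "human"))  # only r3 is human-specific
```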

These numbers are comparable to the extent of enhancer 
gains previously reported in inter-primate comparisons (Cotney 
et al., 2013; Shibata et al., 2012) and the extent of promoter evo- 
lution estimated from mouse-human comparisons (Forrest et al., 
2014; Frith et al., 2006). Especially for enhancers, recently 
evolved regions are 10-20 times more abundant than those 
conserved across placentals or shared across multiple species 
in a particular lineage (Table S4). Both highly conserved and 
recently evolved regulatory regions active in liver are associated 
with increased expression of neighboring genes (Figure S6). 

Exaptation Drives Recently Evolved Enhancer, but Not 
Promoter, Activity 

Using these tens of thousands of apomorphic regulatory regions, 
we tested whether functional exaptation of ancestral DNA, 
recently reported for human-specific enhancers active in embry- 
onic limb (Cotney et al., 2013), is a prevalent mechanism in 
mammalian genome evolution. 

We first asked whether recently evolved proximal promoters 
are primarily found in ancestral DNA sequences older than 100 
Ma (Figure 6A, Figure S7). To our surprise, we discovered that 
across four orders of mammals, the recent evolution of pro- 
moters occurred within evolutionarily younger DNA segments 
(i.e., not shared with other study species) about three to four 
times as often as occurred by exaptation of ancestral DNA. For 
instance, in mouse, 1,400 recently evolved promoters occurred 
in DNA sequences present only in this species (i.e., not shared 
even with rat); in contrast, only 260 recently evolved promoters 
were found in ancestral DNA. 

Within the ancestral DNA commandeered into new promoters, 
and regardless of species interrogated, diverse ERV repeat ele- 
ments are over-represented, consistent with previous reports 
that ERVs are pre-primed to transcriptional initiation (Fort 
et al., 2014). 

Figure 5. Most Highly Conserved Liver Regulatory Regions Are Proximal Promoters 

(A) The ~41,000 regulatorily active regions in hu- 
man liver are shown on the left panel (enhancers: 
orange; promoters: purple). The regulatory ele- 
ments with conserved activity in the ten placental 
species with highest quality genomes (boxed 
inset) were determined by cross-species com- 
parison (Experimental Procedures), identifying 
approximately 300 enhancers and 1,800 pro- 
moters (labeled as highly conserved, right panel). 

(B) Almost all highly conserved promoter regions 
(purple) are located at transcription start sites as 
expected, whereas conserved enhancer regions 
(orange) are typically tens to hundreds of kilobases 
from the nearest gene. 

(C) Regions of highly conserved enhancer and 
promoter activity show a corresponding, but 
modest, increase in selective constraint in their 
underlying DNA sequence. The distribution of 
the fraction of bases under constraint in each 
region within each category is shown as a box- 
plot, with human exons and randomly selected 
regions shown for comparison (Experimental 
Procedures).*** indicates p value < 2 x 10“^®, 
Wilcoxon test. 

See also Figures S5 and S6 and Tables S3, S6, 
and S7. 



In contrast, the vast majority of enhancers in liver are recently 
evolved (Table S4) and are far more likely to exapt ancestral 
DNA (Figure 6B). Of the typically 10,000 recently evolved enhancers
in a given species, 52%-77% contained sequences of 
ancestral DNA over 100 Ma old. The remaining recently evolved 
enhancers were found in younger DNA and were enriched for mobile 
repetitive element families, including LTRs in all lineages and 
lineage-specific SINEs and DNA transposons exclusive to primates,
carnivores, or ungulates (Figure 6B). 

In a typical mammalian species, the 1,000 to 2,000 recently 
evolved liver promoters occur predominantly in younger DNA 
typically less than 40 Ma old, whereas the 10,000 recently 
evolved enhancers are formed predominantly by exaptation of 
ancestral DNA. Only a minority of recently evolved enhancers 
and promoters appear driven by repeat element expansions 
(Figure 6, Figure S7). Across our study’s 20 mammals, exap- 
tation of ancestral DNA generates more of the recently evolved 
regulatory genome than do repeat-driven expansions. 
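
The young versus ancestral distinction used above can be written as a small classifier. The thresholds (0-40 Ma and more than 100 Ma) come from the Figure 6 legend; the "intermediate" label, the function name, and the idea of summarizing each region by a single sequence age are our assumptions for illustration.

```python
from collections import Counter

YOUNG_MAX_MA = 40        # young DNA: 0-40 Ma (Figure 6 legend)
ANCESTRAL_MIN_MA = 100   # ancestral DNA: more than 100 Ma (Figure 6 legend)

def dna_age_class(sequence_age_ma: float) -> str:
    """Classify the DNA underlying a region by its estimated age in Ma."""
    if sequence_age_ma <= YOUNG_MAX_MA:
        return "young"
    if sequence_age_ma > ANCESTRAL_MIN_MA:
        return "ancestral"
    return "intermediate"  # hypothetical label for ages between the two bins

# Hypothetical ages for six recently evolved regions (not study data).
hypothetical_ages = [0, 25, 40, 90, 180, 180]
counts = Counter(dna_age_class(age) for age in hypothetical_ages)
print(counts)
```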

Functional Annotation of Genes under Positive 
Selection 

Comparing genome sequences can suggest which genes drive 
phenotypic adaptations by using inference of regions under pos- 
itive selection and by analyzing amino acid substitution patterns 
in proteins (Nielsen et al., 2007). Both approaches primarily 
employ coding-sequence alignments and thus provide limited 
insight into regulatory adaptations. We therefore asked whether 
genes under positive selection are associated with apomorphic 
enhancers, perhaps evolving synergistically (Shibata et al., 2012). 



We compared recently evolved enhancers and positively 
selected genes in two newly sequenced species: (1) naked 
mole rat, a cancer-resistant rodent (Kim et al., 2011); and (2) dol- 
phin, a marine mammal metabolically adapted to an aquatic 
environment (Sun et al., 2013). In both species, we found that 
recently evolved enhancers are over-represented near positively 
selected genes (Experimental Procedures) (p values = 0.022 
[naked mole rat] and 0.023 [dolphin], hypergeometric test. See 
Table S5). 
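
For readers unfamiliar with the hypergeometric test used above, the following sketch computes a one-sided enrichment p value with the standard library only. The mapping of the study's quantities onto the four parameters (all genes, positively selected genes, genes near a recently evolved enhancer, and their overlap) is our assumption; the example numbers are arbitrary and are not the study's data.

```python
from math import comb

def hypergeom_enrichment_p(overlap: int, population: int, successes: int, draws: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(population, successes, draws).

    population: all genes considered
    successes:  genes in the category of interest (e.g., positively selected)
    draws:      genes in the tested set (e.g., near a recently evolved enhancer)
    overlap:    genes that fall in both sets
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Small worked example: 10 genes, 4 in the category, 3 drawn, at least 2 overlapping.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = (36 + 4) / 120 = 1/3
p = hypergeom_enrichment_p(overlap=2, population=10, successes=4, draws=3)
print(p)
```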

Illustrative examples are shown in Figure 7. First, a recently 
evolved enhancer in naked mole rat is shown upstream of the 
thymopoietin gene (TMPO), identified previously as positively 
selected (Kim et al., 2011). The orthologous TMPO regions in 
human, mouse, cow, and dog show no enhancer activity, 
though a number of partially conserved enhancers are present 
nearby (Figure 7A). Second, the genomic region around the 
TRIP12 gene, under positive selection in dolphin (Sun et al., 
2013), contains a recently evolved dolphin enhancer not active 
in human, mouse, dog, and cow. Moreover, this regulatory 
element appears to be the main enhancer in this region 
(Figure 7B). 

In sum, recently evolved active regions identified in this study, 
and in particular rapidly evolving enhancers, can functionally 
annotate lineage-specific adaptations. 

DISCUSSION 

We experimentally dissected the evolution of regulatory regions 
in mammalian liver by mapping the genome-wide landscape of 

Figure 6. Recently Evolved Promoters Are Largely Derived from Young DNA, While Recently Evolved Enhancers Are Mostly Exapted from 
Ancestral DNA Sequences 

Regions with recently evolved promoter and enhancer activity in liver were identified in a representative species for each placental order (primate:human, 
rodent:mouse, ungulate:cow, and carnivore:dog). These regions were categorised into those falling in (1) young DNA sequences (0-40 Ma) or (2) ancestral DNA 
sequences (>100 Ma). 

(A) Typically three times as many recently evolved active promoters reside in young DNA as are found in ancestral DNA sequences present across placental 
mammals. 

(B) Conversely, typically twice as many recently evolved enhancers are exapted from evolutionarily ancestral DNA as are found in young DNA. 

(C and D) Repeat classes and families enriched in recently evolved promoters and enhancers were identified using a binomial test (see Experimental Procedures). 
Plots show enrichments for each repeat family (y axis) and each species (x axis). Circle sizes represent the statistical significance of enrichment, and color shades 
denote the fold change of the enrichment (both in logarithmic scale). 

See also Figures S6 and S7 and Tables S3, S4, S6, and S7. 



active promoters and enhancers from 20 diverse species. The 
evolutionary distances spanning four distinct orders within class 
Mammalia enabled rigorous analysis of the mechanisms under- 
lying regulatory evolution. The combination of rapid enhancer 
and slower promoter evolution appears to be a fundamental 
property of the mammalian regulatory genome, shared by spe- 
cies separated by up to 180 million years. A sizable number of 
the 10,000-15,000 active promoters are functionally shared 
across most mammals, and are associated with ubiquitous 
cellular functions; highly conserved enhancers are much less 
common, and are found near liver-specific genes. Remarkably, 
almost half of 20,000-25,000 active enhancers in each species 
have rapidly evolved in a lineage- or species-specific manner. 
Our genome-wide mapping of enhancers in previously uncharacterized
species has enabled us to identify regulatory regions 
near genes under positive selection that may help drive pheno- 
typic adaptations. 

A Global Overview of Enhancer and Promoter Evolution 
in Mammals 

We used a powerful and unbiased strategy to confirm, extend, 
and explicitly quantify previous results showing higher conserva- 
tion of active promoter regions compared to distal enhancers in 
selected representatives of mammals (Xiao et al., 2012) or within 
primates (Cotney et al., 2013). 

Our study has a number of limitations. First, the relationship 
between different histone marks and the activity of enhancers 
is not perfectly understood. Most active enhancers are marked 

Figure 7. Recently Evolved Enhancers Associate with Genes under Positive
Selection during Naked Mole Rat and Dolphin Evolution 

(A) The liver enhancer and promoter landscape 
surrounding the TMPO locus, which is under 
positive selection in naked mole rat (Kim et al., 
2011), is shown (upper track). The bottom four 
tracks display overlaid H3K4me3 (blue) and 
H3K27ac (orange) levels in the orthologous re- 
gions of human, mouse, dog, and cow. Shown (left 
to right) are a promoter present in all species, four 
enhancer regions shared in a subset of species, 
and a naked mole rat-specific enhancer whose 
recently evolved activity is not present in other 
study species. 

(B) The enhancer and promoter landscape sur- 
rounding the TRIP12 locus, which is under positive 
selection in dolphins (Sun et al., 2013), is shown. In 
this case, no mammals other than dolphin show 
liver enhancer activity near this gene; this 
enhancer is thus a good candidate to contain the 
regulatory regions associated with positive selec- 
tion in dolphin. 

See also Table S5. 




by H3K27ac (Andersson et al., 2014; Creyghton et al., 2010; Zhu 
et al., 2013), and typically over two-thirds of regions enriched 
for H3K27ac show independent evidence in transgenesis assays 
for regulatory activity (Nord et al., 2013). Global mapping of 
H3K4me1 and p300 can also detect poised enhancer activity 
genome-wide, which can partly differ from that identified 
by H3K27ac (Heintzman et al., 2007; Krebs et al., 2011; Visel 
et al., 2009). Second, other approaches to map regulatory 
sequences, such as DNase-seq (Shibata et al., 2012) or ATAC- 
seq (Buenrostro et al., 2013), can reveal all regions of open 
chromatin genome-wide, but cannot distinguish promoters and 
enhancers. Third, our approach does not directly reveal which 
transcription factors control these regulatory regions, as would 
a more direct comparison (Kunarso et al., 2010; Paris et al., 
2013; Schmidt et al., 2010), which in turn can only capture a 
modest subset of active regions. Fourth, our results generalize 
to other mammalian somatic tissues to the extent that adult liver 
is a representative tissue. However, other studies have sug- 
gested rapid enhancer evolution in mammals, using embryonic 
limb buds (Cotney et al., 2013), adipocytes (Mikkelsen et al., 
2010), and embryonic stem cells (Xiao et al., 2012). These studies 
and others (Barbosa-Morais et al., 2012; Brawand et al., 2011) 
suggest that regulation in other somatic tissues evolves similarly,
though embryonic tissues and their enhancers may be under stronger
evolutionary constraint (Faure et al., 2012; He 
et al., 2011; Nord et al., 2013). Fifth, we 
cannot directly evaluate how often re- 
gions with regulatory activity are fully tis- 
sue-specific, particularly among those 
we assign as enhancers (Zhu et al., 2013). 

One powerful strategy to dissect the 
regulatory genome has been to identify 
regions under high sequence constraint (Lindblad-Toh et al., 
2011). Testing for activity has revealed that thousands of con- 
strained noncoding regulatory sequences can act as enhancers 
in embryonic tissues (Pennacchio et al., 2006). The complemen- 
tary approach we used additionally captures rapidly evolving 
regulatory regions. The enhancer regions we mapped likely 
range in function from essential to dispensable, which is 
reflected both in the modest sequence constraint and rapid 
evolution between species. Most of these regions would likely 
be missed by any sequence-conservation based approach. On 
the other hand, many DNA sequences we do not identify as en- 
hancers may be active in other tissues or embryonic states, 
which we anticipate to be an area of active investigation. 

Rapid enhancer and slow promoter evolution is a fundamental 
property of the mammalian regulatory genome. Active enhancer 
elements have a mean lifetime three times shorter than active 
promoters do, despite similar alignability of their underlying 
DNA sequences. Comparative sequence-based approaches 
have limited power to detect regulatory regions, in part 
because of their rapid evolution (Alfoldi and Lindblad-Toh, 
2013; Lindblad-Toh et al., 2011); indeed, our data indicate 
that sequence-based features such as sequence constraint 
or TF binding site density are poor predictors of enhancer 
conservation. Nevertheless, previous work across Drosophila 
species has indicated that specific TF motifs may be preferen- 
tially preserved in functionally conserved enhancers (Arnold 
et al., 2014). In agreement, we found motifs for the liver-specific 
transcription factor CEBPA enriched in highly conserved liver 
enhancers. 

Active Mammalian Enhancers Are Predominantly 
Apomorphic 

Our results also newly reveal thousands of functionally active 
regulatory regions conserved across placental mammals, the 
vast majority of which are proximal promoter sequences. 
Placental-conserved proximal promoters in mammalian liver 
are commonly associated with ubiquitously expressed genes. 
In contrast, only 12% of highly conserved regulatory regions 
are active enhancers and these are near genes associated with 
liver-specific activities. 

Perhaps our most surprising finding is that representative 
mammals typically deploy over 10,000 enhancers in a line- 
age- and probably most often species-specific manner. In total, 
almost half of all enhancers in each species appear to be recently 
evolved. Our results confirm and extend the concept that exap- 
tation is a widespread phenomenon across placental mammals 
(Cotney et al., 2013), and redeployment of ancestral DNA is the 
dominant mechanism to generate active enhancers across a 
diverse cross-section of mammals. Interestingly, a recent study 
comparing enhancer activity across the much smaller genomes 
of five Drosophila species (Arnold et al., 2014) found a similar pro- 
portion of gained enhancers, especially for more distant species. 

Another mechanism to create regulatory sequences is repeat- 
carried expansion of regulatory elements. Recent studies have 
indicated the involvement of specific repeat element expansions 
in the de novo creation of TF binding sites for CTCF (Bourque 
et al., 2008; Schmidt et al., 2012), Oct4/Nanog (Kunarso et al., 
2010), and NRSF (Mortazavi et al., 2006). Our results show that 
repeat-carriage of newborn enhancers is not the dominant evolu- 
tionary process in mammals: repeat element enrichment is only 
significant among the recently evolved enhancers found in DNA 
less than 40 Ma old. Two technical limitations may have caused 
us to underestimate the repeat-driven creation of recently 
evolved enhancers (also, see Jacques et al., 2013): the difficulty 
of mapping reads to recently duplicated regions, and the incomplete
representation of repeat regions in genome assemblies. 

Recently Evolved Promoters, Though Less Common 
Than Enhancers, Are Mostly Found in Young DNA 

Promoters are far more evolutionarily stable than are enhancers. 
Nevertheless, the absolute number of promoters deeply 
conserved across all 20 study species is similar to the number 
of recently evolved promoters in any one species. Compared 
to the tens of thousands of newborn enhancers arising from exaptation
of ancestral DNA, there are few newborn promoters,
and these often arise from DNA sequences that are themselves 
evolutionarily young. We were not able to identify sequence fea- 
tures that account for the birth of promoters in young DNA. In 
contrast, the recently evolved promoters arising in ancestral se- 
quences overlap LTR repeats, which enrich for latent non-coding 
RNA activity (Fort et al., 2014). 



A Strategy for Identifying the Enhancer Repertoire of 
Unannotated Genomes 

Finally, extending an approach pioneered in well-annotated 
primate genomes (Cotney et al., 2013; Shibata et al., 2012), we 
provide examples of how experimental mapping of enhancers 
and promoters in newly sequenced mammals can annotate 
the regulatory network of genes, which have been identified 
computationally as under positive selection. Across representa- 
tive species, we discovered that recently evolved enhancers 
are significantly over-represented in the vicinity of positively- 
selected genes and can often suggest candidate regulatory 
elements that could mediate species-specific adaptations. 
This result was obtained using only a single somatic tissue. Similarly
significant associations likely also exist between the 
newly evolved enhancers specific to other somatic tissues and 
positively selected genes, which would uncover an extensive 
repertoire of highly evolvable, potentially synergistic regulatory 
connections. 

Future Directions 

Our quantitation and analysis of the evolution of promoters and 
enhancers across a wide cross-section of mammals has revealed 
how dynamic and rapid enhancer evolution is. Within this regula- 
tory diversity are the instructions by which a small number 
of founder species have radiated into surprising new niches, 
including marine (cetaceans) and aerial environments (bats). 
By combining detailed investigations of carefully selected sub- 
clades with new tools for modifying any sequenced genome, 
future studies will identify, formalize, and explore the functional 
instructions directing the diversity of mammalian forms. 

EXPERIMENTAL PROCEDURES 

We performed ChIP-seq using liver tissue isolated from 20 mammalian species 
(Table S1). At least two independent biological replicates from different 
animals, generally young adult males, were performed for each species and 
antibody. The only exceptions were Balaenoptera borealis, for which a single 
individual was profiled, and dolphin, for which we profiled a single individual 
from each of two closely-related species. ChIP-seq experiments were performed as 
recently described (Aldridge et al., 2013) with antibodies against H3K4me3 
(Millipore 05-1339) and H3K27ac (Abcam ab4729). To match inter-individual 
variability for the two histone marks, the same tissue samples were used for 
both antibodies and control input DNA in each species. 

Sequencing reads were aligned to the appropriate reference genome with 
BWA v.0.5.9 (Table S2) and regions of enrichment determined with MACS 
v1.4.2. Regions enriched in two to four biological replicates and overlapping 
by a minimum 50% of their length were merged and categorized into 
active promoters (H3K4me3-enriched regions, with or without overlapping 
H3K27ac enrichment) or enhancers (regions enriched only for H3K27ac). 
Cross-species comparisons were performed through the Ensembl API. 
Human, macaque, vervet, marmoset, mouse, rat, rabbit, cow, pig, dog, and 
cat were directly cross-compared using the 13 eutherian mammals EPO alignment
available from Ensembl (Flicek et al., 2014). Species not included in the 
EPO alignment were compared to the reference species of their respective 
clade (human, mouse, cow, dog, or opossum) using Lastz alignments. Promoters
or enhancers were considered as having conserved activity between 
species when their orthologous location in the second species overlapped 
a marked region by a minimum of 50% in length. All pairwise comparisons 
correspond to average values of reciprocal comparisons between species. 
Genome annotations (including gene ontology and repetitive and constrained 
elements) were downloaded from Ensembl v73. See also Extended Experimental
Procedures. 
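
The classification and overlap rules described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: intervals are half-open coordinate pairs on one chromosome, the 50% threshold is applied to the length of the projected (orthologous) region, and all function names are hypothetical.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, same chromosome assumed

def classify_region(has_h3k4me3: bool, has_h3k27ac: bool) -> str:
    """Promoter: H3K4me3 with or without H3K27ac. Enhancer: H3K27ac only."""
    if has_h3k4me3:
        return "promoter"
    if has_h3k27ac:
        return "enhancer"
    return "unmarked"

def overlap_fraction(region: Interval, other: Interval) -> float:
    """Fraction of `region` covered by `other`."""
    start, end = region
    overlap = max(0, min(end, other[1]) - max(start, other[0]))
    return overlap / (end - start)

def has_conserved_activity(projected: Interval,
                           marked_regions: Iterable[Interval],
                           min_fraction: float = 0.5) -> bool:
    """True if the orthologous location overlaps any marked region by at
    least `min_fraction` of its own length (our reading of the 50% rule)."""
    return any(overlap_fraction(projected, m) >= min_fraction for m in marked_regions)

# Toy example: a human region projected onto another genome at 100-200.
print(classify_region(has_h3k4me3=False, has_h3k27ac=True))
print(has_conserved_activity((100, 200), [(150, 400)]))
print(has_conserved_activity((100, 200), [(151, 400)]))
```

Because the study averages reciprocal comparisons, such a check would be run in both directions for each species pair and the two resulting fractions averaged.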

Acetate Fuels the Cancer Engine 
(Cell 159, 1492-1494; December 18, 2014) 

Due to a production error, a label in Figure 1 of the Preview above was incorrect. Within the figure, the text “ADP, Pi” should have read 
“AMP, PPi.” The figure has been corrected online and appears below. 

Figure 1. Acetyl-CoA Is a Central Node in Carbon Metabolism 

In characterizing ATRX as a regulator of X inactivation, we used two shRNAs to knockdown ATRX and then visualized the effects on 
Xist localization using FISH. In Figure 1C, we erroneously presented duplicate images of cells treated with shATRX2 showing both 
Xist and Xist plus DAPI staining in place of images showing cells treated with shATRX1. The quantification of Xist levels presented 
in Figure 1C reflects a separate qRT-PCR analysis and is not affected. The corrected figure, showing shATRX1-treated cells from the 
same experiment, appears below. Additionally, in the interest of experimental transparency, the accompanying figure legend has 
been updated to include the data acquisition parameters used in the experiment from which the images were derived. The figure 
and figure legend have been corrected online. 

Figure 1. A Proteomics Screen Identifies ATRX as a Candidate XCI Regulator 

(A) IP-MS: Colloidal blue staining of FLAG IP from control (293F) and FLAG-mH2A-expressing 293 run on a 4%-20% (left) and a 6% (right) SDS gradient gel. FLAG
IP was validated by western blot.

(B) Left: Immunostaining of ATRX, EZH2, and H3K27me3 in WT and two independent stable ATRX-KD MEF lines (shATRX-1, -2). Sample size (n) and %EZH2
association with Xi are shown. Middle: western blot showing ATRX depletion but constant EZH2 levels in shATRX-1 and shATRX-2 female MEFs. Right: Patterns
of H3K27me3 observed, n = 100-150 per experiment.

(C) Left: Xist RNA FISH in indicated fibroblast lines with FITC acquisition times of 500 ms and a gain of 61 for all samples. Right: qRT-PCR analyses of Xist RNA
levels. SE bars from three independent experiments are shown with Student’s t test p values.

(D) Left: Immunostaining of ATRX and EZH2 in MEFs transiently transfected with scrambled shRNA (shScr) and two shATRX constructs (shATRX-1, shATRX-2).
Right: H3K27me3 staining and Xist RNA FISH show no change in the intensity or foci number after transient ATRX KD.
Genetic landscape of Parkinson’s Disease
Gene | Gene name | Location | Possible pathways / pathological biological processes

MENDELIAN GENES
SNCA | Synuclein, alpha | 4q21 | Synaptic function; mitochondrial function; autophagy/lysosomal degradation
PARK2 | Parkin RBR E3 ubiquitin protein ligase | 6q25.2-q27 | Mitochondrial function/mitophagy; ubiquitination; synaptic function
PINK1 | PTEN-induced putative kinase 1 | 1p36 | Mitochondrial function/mitophagy
PARK7/DJ-1 | Parkinson protein 7 | 1p36.23 | Inflammation/immune system; mitochondrial function
LRRK2 | Leucine-rich repeat kinase 2 | 12q12 | Synaptic function; inflammation/immune system; autophagy/lysosomal degradation
PLA2G6 | Phospholipase A2, group VI | 22q13.1 | Mitochondrial function
FBXO7 | F-box protein 7 | 22q12.3 | Ubiquitination; mitochondrial function/mitophagy
VPS35 | Vacuolar protein sorting 35 homolog (S. cerevisiae) | 16q12 | Autophagy/lysosomal degradation; endocytosis
ATP13A2 | ATPase type 13A2 | 1p36 | Mitochondrial function; autophagy/lysosomal degradation
DNAJC6 | DnaJ (Hsp40) homolog, subfamily C, member 6 | 1p31.3 | Synaptic function; endocytosis
SYNJ1 | Synaptojanin 1 | 21q22.2 | Synaptic function; endocytosis

RISK GENES
GBA | Glucosidase, beta, acid | 1q21 | Inflammation/immune system; autophagy/lysosomal degradation; metabolic pathways

RISK LOCI
MAPT | Microtubule-associated protein tau | 17q21.1 | Microtubule stabilization and axonal transport
RAB7L1 | RAB7, member RAS oncogene family-like 1 | 1q32 | Autophagy/lysosomal degradation
BST1 | Bone marrow stromal cell antigen 1 | 4p15 | Immune system
HLA-DRB5 | Major histocompatibility complex, class II, DR beta 5 | 6p21.3 | Inflammation/immune system
GAK | Cyclin-G-associated kinase | 4p16 | Autophagy/lysosomal degradation; synaptic function; endocytosis
ACMSD | Aminocarboxymuconate semialdehyde decarboxylase | 2q21.3 | Tryptophan metabolism; metal ion binding; metabolic pathways
STK39 | Serine threonine kinase 39 | 2q24.3 | Inflammation/immune system; protein kinase binding; cellular stress response
SYT11 | Synaptotagmin XI | 1q21.2 | Synaptic function; transporter activity; metal ion binding; substrate for PARK2
FGF20 | Fibroblast growth factor 20 | 8p22 | Growth factor activity; FGF receptor binding
STX1B | Syntaxin 1B | 16p11.2 | Synaptic function; SNAP receptor activity; protein domain-specific binding
GPNMB | Glycoprotein (transmembrane) nmb | 7p15 | Integrin binding; heparin binding; cancer pathways
SIPA1L2 | Signal-induced proliferation-associated 1 like 2 | 1q42.2 | GTPase activator activity
INPP5F | Inositol polyphosphate-5-phosphatase F | 10q26.11 | Phosphoric ester hydrolase activity
MIR4697HG | MIR4697 host gene (non-protein coding) | 11q25 |
GCH1 | GTP cyclohydrolase 1 | 14q22.1-q22.2 | GTP binding; calcium ion binding; BH4 metab; metabolic pathways
VPS13C | Vacuolar protein sorting 13 homolog C (S. cerevisiae) | 15q22.2 | Endocytosis
DDRGK1 | DDRGK domain containing 1 | 20p13 | Protein binding
MCCC1 | Methylcrotonoyl-CoA carboxylase 1 (alpha) | 3q27 | Biotin carboxylase activity; methylcrotonoyl-CoA carboxylase activity; metabolic pathways
SCARB2 | Scavenger receptor class B, member 2 | 4q21.1 | Autophagy/lysosomal degradation; receptor activity (lysosomal receptor for GBA targeting); enzyme binding
CCDC62 | Coiled-coil domain containing 62 | 12q24.31 | Nuclear receptor coactivator; cancer pathways
RIT2 | Ras-like without CAAX 2 | 18q12.3 | Synaptic function; calmodulin binding; GTP binding
SREBF1 | Sterol regulatory element binding transcription factor 1 | 17p11.2 | Chromatin binding; cholesterol and steroid metabolic processes
Different types of genetic technologies and approaches allow for the study and identification of different types of genetic variability in a disease. Here, represented are the
genes and genetic loci independently replicated as being associated with the development of Parkinson’s disease (PD)/parkinsonism. 

Genetic analyses of familial cases (mainly genetic linkage [blue area] and, more recently, exome sequencing [green area]) have led to the identification of causative mutations
in 11 genes implicated in monogenic typical or atypical forms of parkinsonism. From these, eight genes have been associated with autosomal-recessive patterns
of inheritance, either causing typical early-onset PD (PARK2, PINK1, and DJ-1/PARK7) or atypical forms of parkinsonism with juvenile onsets (ATP13A2, PLA2G6, FBXO7,
DNAJC6, and SYNJ1). Three genes have been shown to cause typical autosomal dominant PD phenotypes (SNCA, LRRK2, and VPS35) associated with early- or late-onsets
of disease. Additionally, other genes are known to harbor mutations associated with non-PD disorders that may present with parkinsonism, for example ATXN2, ATXN3, GCH1,
GRN, MAPT, C9ORF72, CSF1R, TH, and SPG1. Very recently, mutations in RAB39B have been described as causing X-linked intellectual disability plus a phenotype
indistinguishable from early-onset PD.

Other chromosomal loci (like PARK3 and PARK10, for example) have been identified by genome-wide approaches as genomic regions associated with typical PD. These 
loci may contain other, still-to-be-identified, PD genes. 

Lastly, there are also some genes that have been suggested to harbor causative PD mutations, which have not been confirmed: GIGYF2, HTRA2, UCHL1, EIF4G1, and SPR. 
By using genetic family studies (for LRRK2) and a candidate gene approach (for GBA), high-risk variants (with odds ratio in the range of 5-8) for the development of typical
PD have been identified in both genes (central part of the graph). 

More recently, the development of whole-genome genotyping platforms (pink area of the graph) has allowed for the study of the involvement of common variants with 
low risk in the disease. This has led to the identification of 24 new genetic loci by several independent genome-wide association studies (GWAS) and meta-analyses: SNCA, 
LRRK2, MAPT, RAB7L1, BST1, HLA-DRB5, GAK, ACMSD, STK39, MCCC1, SYT11, CCDC62, FGF20, STX1B, GPNMB, SIPA1L2, INPP5F, MIR4697HG, GCH1, VPS13C, SCARB2,
RIT2, DDRGK1, and SREBF1. Because of the way these studies are designed, they only identify genetic regions associated with disease and not specific genes or variants. For 
this reason, if the significant single nucleotide polymorphisms (SNPs) are intergenic or the region contains more than one gene, the locus usually gets its name from the gene 
closest to the significant hit. Very few of these significant hits have a clear functional role in the disease and, because of this, follow-up work is currently underway to determine 
exactly which genes and genetic variants are important for the disease and how they are exerting their effect. Recently, an unbiased screen for interactors of LRRK2 identified 
the most likely candidates for two of these GWAS loci: in chromosome 1q32, the associated locus (originally named RAB7L1/NUCKS1) contained five genes, and in
chromosome 4p16, the associated locus (originally named GAK/TMEM175/DGKQ) contained nearly ten genes. RAB7L1 and GAK have now been identified as LRRK2 interactors and,
in this way, as the most likely hits in each region. Additionally, these proteins were shown to form a newly identified protein complex that promotes clearance of Golgi-derived
vesicles via the autophagy-lysosome system both in vitro and in vivo, clearly highlighting the role of the “Autophagy/Lysosomal degradation” pathway in Parkinson’s disease.

More generally, pathway analyses of GWAS data implicated other biological processes as primary etiological events in the disease with significant overrepresentation of 
association signals in pathways related to “regulation of leucocyte/lymphocyte activity,” “cytokine-mediated signaling,” “axonal guidance,” “focal adhesion,” and “calcium 
signaling.” 

Representation of genes within each group in the graph is approximate and does not reflect differences in frequency or risk. 

Pleomorphic Risk— Exemplified by SNCA 

This panel illustrates that, at the same locus, several disease-related genetic mechanisms may co-exist, each influencing disease through different biological effects on a 
single gene. In this particular model, expression of a gene is positively correlated with risk, as shown by duplication mutations causing Parkinson’s disease. Five coding
mutations have been identified as the cause of disease in early-onset familial cases. Duplications and triplications of the SNCA locus have also been implicated as the cause of
early-onset Parkinson’s disease. Interestingly, GWAS have also identified two different association signals in this locus, representing common variability with a low effect in the 
disease. Possible protective variants are also represented in the graph. 